ABSTRACT

We introduce a probabilistic approach to the problem of counting dwarf satellites around host 
galaxies in databases with limited redshift information. This technique is used to investigate the 
occurrence of satellites with luminosities similar to the Magellanic Clouds around hosts with properties 
similar to the Milky Way in the object catalog of the Sloan Digital Sky Survey. Our analysis uses data 
from SDSS Data Release 7, selecting candidate Milky Way-like hosts from the spectroscopic catalog
and candidate analogs of the Magellanic Clouds from the photometric catalog. Our principal result
is the probability for a Milky Way-like galaxy to host N_sat close satellites with luminosities similar
to the Magellanic Clouds. We find that 81 percent of galaxies like the Milky Way have no such 
satellites within a radius of 150 kpc, 11 percent have one, and only 3.5 percent of hosts have two. 
The probabilities are robust to changes in host and satellite selection criteria, background-estimation 
technique, and survey depth. These results demonstrate that the Milky Way has significantly more 
satellites than a typical galaxy of its luminosity; this fact is useful for understanding the larger 
cosmological context of our home galaxy. 

Subject headings: galaxies: dwarf — Magellanic Clouds — Local Group — galaxies: statistics — dark 
matter 



1. INTRODUCTION 

Our home galaxy, the Milky Way, is in many respects 
the best studied galaxy in the Universe. There are nu- 
merous measurements that can only be made in the 
Milky Way, including detailed studies of resolved stellar 
populations and the detection and dynamical measure- 
ments of the faintest satellite galaxies. Furthermore, the 
Milky Way is a critical testbed for dark matter stud- 
ies, as it is one of the only places where self-annihilation 
or weak interactions can be directly observed. As such, 
studies of the Milky Way have long provided key insights 
into aspects of cosmology and galaxy formation. 

To fully interpret this panoply of observations of the 
Milky Way (MW) in the context of a cosmological model, 
it is essential to understand whether or not the MW 
is a typical galaxy of its mass or luminosity. One of 
the most cosmologically interesting statistical properties 
of the MW that can be readily studied is the number 
and properties of its satellites. It has been apparent for 
more than a decade that N-body simulations of Galaxy-sized
(i.e., M ~ 10^12 M_sun) dark-matter halos predict an
abundance of low-mass subhalos that exceeds the observed
population of MW dwarf satellites by more than
an order of magnitude (Moore et al. 1999; Klypin et al.
1999b; for recent reviews see Bullock 2010; Kravtsov
2010); this is the so-called "missing satellites problem."
A number of different theoretical solutions to this prob- 
lem have been proposed, focusing either on reducing the 
small-scale power in the dark-matter power spectrum, or 
on suppressing galaxy formation in low-mass halos (see
e.g., Madau et al. 2008; Busha et al. 2010, and references
therein).

1 Present address: Institute for Theoretical Physics, University
of Zurich, 8057 Zurich, Switzerland

In the last several years, it has become apparent that 
some of this discrepancy was due to galaxies that had not 
yet been observed. The unprecedented deep, wide-field
imaging from the Sloan Digital Sky Survey (SDSS; York
et al. 2000) has yielded detections of a substantial number
of previously unknown dwarf companions to the MW
(e.g., Belokurov et al. 2007; Walsh et al. 2009). This has
led to a reassessment of the missing satellites problem
from the observational side, with the result that proper
accounting for the detectability of MW dwarf satellites
(Koposov et al. 2008) results in substantial upward
corrections to the satellite luminosity function. This leads
to the prediction that the MW hosts hundreds of faint
satellites that are currently undetected (Tollerud et al.
2008; Walsh et al. 2009).

By contrast, there has been some indication that the 
missing-satellites problem might reverse itself at high 
masses: high-resolution Galactic-halo simulations generally
have too few subhalos in the mass range of the Large
and Small Magellanic Clouds (LMC and SMC; e.g.,
Madau et al. 2008). However, it has been difficult to
draw robust conclusions on this point, since high-mass
subhalos, such as those that might host the
Magellanic Clouds, are few in number in any individual
Galactic-halo simulation, and in any case the abundance
of such massive subhalos might be suppressed by the
limited number of long-wavelength density-fluctuation
modes in a small simulation volume.

With the completion of recent high-resolution N-body 
simulations over cosmological volumes, such as the
Millennium II and Bolshoi simulations (Boylan-Kolchin
et al. 2009; Klypin et al. 2010), it has become possible
to probe analogs of the MW-LMC-SMC system with
greatly improved statistical significance, since these
simulations can resolve LMC- and SMC-mass subhalos in
large numbers of MW-mass halos. For example,
Boylan-Kolchin et al. (2010) found that MW-mass halos (with
M_halo ~ 10^12 M_sun) seldom host subhalos that can be
identified as analogs to the Magellanic Clouds, with less than
10% of MW-sized halos hosting two such subhalos.
Similar results are seen in the Bolshoi simulation, as will
be discussed in a companion paper to this one (Busha
et al. 2010b). It thus appears that the MW-LMC-SMC
system is somewhat atypical in the context of the Cold
Dark Matter paradigm.

On its face, this theoretical result seems to be mildly 
anti-Copernican, so it is especially important to confront 
it with observational data. But any satellite-counting 
exercise is classically complicated by the faintness of 
the satellites, which makes obtaining redshift information
difficult. Redshift-space studies of satellite dynamics
that have aimed to probe galaxy halo masses (e.g.,
Zaritsky et al. 1993; Prada et al. 2003; Conroy et al.
2005, 2007) or profiles (e.g., Chen et al. 2006) have
typically had redshift information for ≲ 1 satellite per host
galaxy, even while using criteria for satellite selection
that are significantly more relaxed than we would like
to use to select LMC/SMC analogs. Indeed, as discussed
in Section 2.4 below, even the vast SDSS spectroscopic
database contains fewer than 100 MW-like systems that
host one or more MC-like satellites with redshifts.

A common method for overcoming the lack of spectro- 
scopic information for satellites has been to count candi- 
date satellites around bright hosts in photometric data 
and to statistically subtract or correct for the contribu- 
tion of foreground and background objects (hereafter, we 
will simply use the term "background" as a shorthand to 
refer to both foreground and background objects). In
a pioneering paper, Holmberg (1969) found that nearby
bright spiral galaxies, with a rather wide range of
luminosities, typically hosted between 0 and 5 satellites
brighter than an absolute magnitude around -10.6.
Lorrimer et al. (1994) carried out a similar analysis and
found that galaxies brighter than M_BT = -18.5 host
1.1 satellites in the range -16 > M_BT > -18, on average,
with fewer satellites around spirals (0.5 on average)
than ellipticals (1.8). Both of those studies accounted for
background contamination by subtracting off the average
number of faint galaxies in nearby fields from the counts
around bright galaxies; therefore, they were only able to
measure the average number of satellites around bright
galaxies. As discussed in Section 3, they cannot address
the probability of hosting a certain number of satellites,
which is what we have set out to do here.



Recently, James & Ivory (2010) carried out a quasi- 
spectroscopic analysis by targeting 143 bright galaxies of 
known redshift for follow-up with narrow-band imaging 
centered near the expected wavelength of Hα at the redshift
of the target galaxy. This allowed them to count the
number of roughly MC-like star-forming objects within
a few hundred km s^-1 of each host, yielding a plausible
measurement of the satellite number-count distribution.
In broad agreement with simulations, they find that
roughly two-thirds of their target galaxies have zero such
satellites, while only ~ 5% have two. This clearly confirms
that the Magellanic Clouds are indeed quite rare.
There is significant uncertainty in the details, however,
owing both to the small sample size and to the width of
the imaging band, which will detect galaxies in Hα up
to 30 Mpc away from the host along the line of sight, so
the potential for significant background contamination
remains (see footnote 2). In addition, comparison to the
subhalo population in simulations is difficult, since only
star-forming galaxies are selected.

In this paper, we employ the enormous statistical 
power of the SDSS to obtain a statistically robust result 
for the frequency of LMC and SMC analogs in galaxies 
like the MW. We use the main SDSS spectroscopic catalog
to identify a sample of > 2 x 10^4 isolated galaxies
with luminosities similar to the MW. As in the studies
above, we then count photometric companions around
host galaxies with known redshifts, but here we introduce
a new technique for statistical background removal that
allows us to recover the true probability distribution of
satellite number counts around these hosts, P(N_sat). To
do this, we make use of the fact that our measured num- 
ber counts represent a convolution of the true satellite 
distribution with the distribution of background counts. 
We can also measure the latter distribution in the data, 
and then a simple deconvolution yields the desired re- 
sult. We pay careful attention to the details of our 
background-estimation techniques to ensure that we ac- 
count for all possible sources of systematic error, par- 
ticularly those arising from the clustering of background 
galaxies with our hosts. 
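
The convolution relationship described above can be illustrated with a short sketch. This is not the procedure used in this paper, whose details are developed in later sections; it is a generic forward-substitution deconvolution that assumes satellite and background counts are independent, that the background distribution has nonzero probability of zero counts, and that both input distributions are known exactly. The function names and the toy numbers are hypothetical.

```python
def convolve(p_s, p_b):
    """Distribution of T = S + B for independent counts S and B."""
    out = [0.0] * (len(p_s) + len(p_b) - 1)
    for i, s in enumerate(p_s):
        for j, b in enumerate(p_b):
            out[i + j] += s * b
    return out


def deconvolve(p_t, p_b, n_max):
    """Recover p(S) for S = 0..n_max from p(T) and p(B).

    Uses p_T[n] = sum_k p_B[k] * p_S[n - k], solved term by term,
    which requires p_B[0] > 0.
    """
    if p_b[0] <= 0.0:
        raise ValueError("p_b[0] must be positive for forward substitution")
    p_s = []
    for n in range(n_max + 1):
        # Contribution of already-solved terms p_S[0..n-1] to p_T[n].
        known = sum(p_b[k] * p_s[n - k] for k in range(1, n + 1) if k < len(p_b))
        t_n = p_t[n] if n < len(p_t) else 0.0
        p_s.append((t_n - known) / p_b[0])
    return p_s


# Toy inputs (made-up numbers, not measurements from this paper).
true_p_s = [0.80, 0.15, 0.05]
toy_p_b = [0.50, 0.30, 0.15, 0.05]
observed_p_t = convolve(true_p_s, toy_p_b)
recovered = deconvolve(observed_p_t, toy_p_b, n_max=2)
```

With noisy measured distributions, a direct inversion of this kind amplifies errors at large counts, which is one reason careful treatment of the error budget matters.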

Our principal result is that only 3.5% ± 1.4% of galax- 
ies with luminosity similar to the MW host two satel- 
lites similar to the Magellanic Clouds within a radius of 
150 kpc. When we split the sample into red and blue 
hosts, we find that excluding red-sequence galaxies from 
our sample of hosts has no significant effect on the prob- 
ability of hosting any number of bright satellites. This 
confirms that the MW-LMC-SMC system is indeed quite 
unusual compared to the population of isolated galaxies 
with similar luminosity and color. These results are also 
broadly consistent with the predictions from simulations 
(Boylan-Kolchin et al. 2010). We present a detailed comparison
with the predictions of high-resolution ΛCDM
simulations in a companion paper (Busha et al. 2010b);
the general conclusion from both simulations and obser- 
vations is that the MW-LMC-SMC system is quite rare. 

This paper is structured as follows. First, we describe
the SDSS dataset and our criteria for selecting analogs
to the MW, LMC and SMC in Section 2. In Section 2.4
we perform a preliminary analysis on SDSS galaxies with
spectroscopically identified satellites. Finding it difficult
to cleanly interpret these results, we then move on to
develop our photometric background subtraction method
in Section 3, being careful to fully account for the
statistical and systematic error budget. We perform the
counting and background-correction exercise in two
different ways, which have different approaches to handling
systematic errors. We obtain similar results for these two
different procedures, which we present in Section 5. In
that section, we also discuss the sensitivity of these
results to a number of assumptions in the analysis. We also
perform an additional analysis in the deep photometric
data from SDSS Stripe 82 to test the robustness of our
analysis. We discuss the implications of our results in
Section 6.

2 Indeed, as we will discuss, even in the case of perfect
spectroscopic information, redshift-space distortions lead to significant
contamination from interlopers, which must be accounted for.

Throughout, distances and absolute magnitudes are
calculated using a flat ΛCDM cosmological model with
Ω_M = 0.3. All distances quoted in this paper are given in
physical (rather than comoving) units, and all distances
and absolute magnitudes derived from SDSS data assume
a Hubble constant of H_0 = 70 km s^-1 Mpc^-1.
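
As an illustration of how these cosmological parameters enter the analysis, the sketch below converts a physical radius on the sky into an angle at a given redshift, using the flat-model values quoted above. The function names, the trapezoidal integration, and the number of integration steps are assumptions made for this example, not details taken from the paper.

```python
import math

C_KM_S = 299792.458  # speed of light [km/s]
H0 = 70.0            # Hubble constant [km/s/Mpc], as quoted in the text
OMEGA_M = 0.3        # matter density; flatness implies Omega_Lambda = 1 - OMEGA_M


def comoving_distance_mpc(z, steps=1000):
    """Line-of-sight comoving distance in Mpc (trapezoidal integration)."""
    def inv_e(zz):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zz) ** 3 + (1.0 - OMEGA_M))

    dz = z / steps
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    total += sum(inv_e(i * dz) for i in range(1, steps))
    return (C_KM_S / H0) * total * dz


def angular_diameter_distance_mpc(z):
    """Angular diameter distance in Mpc for a flat universe."""
    return comoving_distance_mpc(z) / (1.0 + z)


def aperture_arcsec(radius_kpc, z):
    """Angle (arcsec) subtended by a physical radius at redshift z > 0."""
    theta_rad = (radius_kpc / 1000.0) / angular_diameter_distance_mpc(z)
    return math.degrees(theta_rad) * 3600.0
```

For example, `aperture_arcsec(150.0, 0.05)` gives the angular search radius corresponding to 150 physical kpc around a host at z = 0.05; the angle shrinks as the host redshift increases.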

2. DATA AND SAMPLE SELECTION 

2.1. SDSS Catalogs 

We use data from the spectroscopic and imaging catalogs
of the Sloan Digital Sky Survey (SDSS) seventh data
release (DR7; Abazajian et al. 2009). Because our
potential MW-LMC-SMC analogs are at very low redshift,
and because we require them to be more than 500 kpc
from a survey edge (see Section 2.2), the area usable
for the analysis is limited. The main sample of MW-sized
hosts is selected from among the spectroscopic targets in
a contiguous, 3350 square degree section of the Northern
Galactic Cap. The deeper imaging in Stripe 82 (an
approximately 280 square-degree strip along the celestial
equator) allows for a second, much smaller sample of
MW analogs extending to slightly higher redshifts in the
southern sky (we analyze these data separately to check
the robustness of our methods; see Section 5.3.2).
Spectra and k-corrected luminosity values are taken from the
NYU Value Added Galaxy Catalog (NYU-VAGC;
Blanton et al. 2005b; see footnote 3).



The r-band magnitude limit of the main (non-QSO) 
spectroscopic galaxy catalog is 17.77. We use this limit 
along with the PRIMTARGET type designation to iso- 
late only those members of the main catalog identified as 
galaxies. We refer to the resulting catalog as the spec- 
troscopic galaxy sample. Since we are interested in rela- 
tively faint satellites (between 2 and 4 magnitudes dim- 
mer than their hosts), the spectroscopic catalog alone is 
insufficient for our purposes. In order to collect ample 
statistics we must also use the SDSS photometric data, 
which is complete down to at least r ~ 21.5 for extended 
sources. 

Our general strategy is to select candidate MW-sized 
host galaxies from the spectroscopic sample and conduct 
our search for satellites around these objects within the 
deeper imaging catalog. From the NYU-VAGC we obtain
the k-corrected absolute magnitudes for potential
hosts computed using spectroscopic redshift information.
From the imaging catalog we obtain apparent magnitudes
of galaxies and photometric redshift information.
In our core analysis we use photometric redshift
probability distributions, p(z), determined by Cunha et al.
(2009) for the DR7 SDSS imaging catalog using an
artificial neural network algorithm.

2.2. Selection of Milky Way-Sized Central Galaxies 

2.2.1. Luminosity Requirements 
As discussed in Section 2.3, we count candidate
LMC/SMC analogs by looking for galaxies 2 to 4
magnitudes fainter than their hosts. Thus, we limit our pool
of potential hosts to galaxies more than 4 magnitudes
brighter than the limit of our photometric sample.
Objects dimmer than r = 21 in the imaging catalog are
particularly prone to catastrophic photo-z failures, owing
to their large photometric errors and the sparseness
of the available spectroscopic training set. We therefore
limit the pool of potential MW-analog hosts to apparent
magnitude r < 17 to avoid this uncertain regime in
the photometric sample (in the deeper co-added Stripe
82 data we consider hosts as dim as r = 17.6). From this
reduced spectroscopic sample, we select a statistically
robust set of MW-luminosity hosts as follows.

The current best estimate for the absolute magnitude
of the MW is M_V = -20.9 in the Vega photometric system
(van den Bergh 2000). In order to translate this
value to the SDSS photometric system, we convert to
the AB system and also apply an appropriate magnitude
correction to the SDSS r filter (which is the filter that
has the strongest overlap with V). To accomplish the
latter conversion, we compute estimated absolute ^0.0 V and
^0.1 r-band magnitudes (see footnote 4) for a large sample
of SDSS spectroscopic targets, using the kcorrect algorithm,
version kcorrect v4_1_4 (Blanton & Roweis 2007). We then
compute the mean V - r color for galaxies within ±0.2
magnitudes of the MW and apply this average correction
to the measured M_V of the MW. Because the V and r
bands overlap strongly, this correction is quite small (so,
for example, splitting the sample by color before
computing it will not make a significant difference). The
resulting absolute ^0.1 r-band magnitude of the MW is
M_0.1r = -21.2. We consider a galaxy to be a potential
MW-like host if it is within ±0.2 magnitudes of this
absolute magnitude.
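
The luminosity requirements of this subsection amount to two simple cuts, which can be sketched as follows. The adopted absolute magnitude, the ±0.2 magnitude window, and the apparent-magnitude limits are taken from the text; the function and constant names are hypothetical, and the sketch omits the isolation criteria described in the next subsection.

```python
MW_ABS_MAG_R = -21.2       # adopted 0.1r-band absolute magnitude of the MW (from the text)
ABS_MAG_TOLERANCE = 0.2    # half-width of the luminosity window [mag]
HOST_APP_MAG_LIMIT = 17.0  # apparent-magnitude limit for hosts in the main sample


def is_mw_luminosity_host(abs_mag_r, app_mag_r, app_limit=HOST_APP_MAG_LIMIT):
    """True if a galaxy passes the luminosity and apparent-magnitude cuts."""
    in_window = abs(abs_mag_r - MW_ABS_MAG_R) <= ABS_MAG_TOLERANCE
    bright_enough = app_mag_r < app_limit
    return in_window and bright_enough
```

Passing `app_limit=17.6` reproduces the looser limit quoted for the deeper co-added Stripe 82 data.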

It is worth noting in passing that the absolute magni- 
tude of the MW is difficult to measure and may be sub- 
ject to quite large uncertainties. Since these are rather 
difficult to quantify, we adopt a best-guess value for the 
MW luminosity here and defer study of the satellite pop- 
ulation's dependence on host luminosity to future work. 

2.2.2. Isolation Criteria 

Our aim is to count MC-like satellites around MW-like
host galaxies. To ensure that the satellites we are
counting are indeed hosted by MW analogs, we require
that each candidate host, like the MW, is not itself
a satellite of a more massive system (see footnote 5). This
criterion is simple to impose if we presume that there is a
monotonic relation between dark-matter halo mass and galaxy
luminosity: we can then impose a radius of isolation (R_iso),
within which no other similarly luminous galaxy may reside.
More specifically, a candidate host is eliminated
if, within this region, (1) a galaxy brighter (in absolute
magnitude) than M_host + ΔM_iso is found within ±1000
km s^-1 of the host redshift, or (2) a galaxy brighter (in
apparent magnitude) than m_host + ΔM_iso is found with
no redshift information (see footnote 6). In our primary
analysis, we set ΔM_iso = 0 and reject only those candidate
hosts with a brighter neighbor. In Section 5.3, we also
consider the impact of a more stringent condition, requiring
that our MW analogs have no close neighbors of similar
brightness (up to ΔM_iso = 2), and we show that the results
are insensitive to this detail.

3 http://sdss.physics.nyu.edu/vagc/

4 The superscripts indicate the assumed redshift to which the
colors are k-corrected; 0.1 is standard for the VAGC, while 0.0 is
appropriate for the MW. We assume h = 0.7 in computing all
k-corrected absolute magnitudes.

5 Although the MW and M31 are gravitationally bound, and
M31 may be the more massive galaxy, the MW is not classified
as a satellite of M31, since the two galaxies do not (yet) form a
virialized system.

Fig. 1. Purity and completeness of a mock sample of MW-like
galaxies as a function of changing isolation radius, R_iso. The mock
galaxies are based on halos and subhalos in a high-resolution N-body
simulation, with similar cuts applied as to the data sample.
Over 95% purity is achieved for a cut of R_iso = 0.5 Mpc.
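
The two-part isolation test of Section 2.2.2 can be sketched as a filter over the neighbors of each candidate host. This is a minimal illustration under stated assumptions: the `Neighbor` record, its field names, and the function name are hypothetical, and projected separations and velocity offsets are taken as precomputed inputs.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Neighbor:
    proj_dist_mpc: float              # projected physical separation from the host
    app_mag: float                    # apparent r-band magnitude
    abs_mag: Optional[float] = None   # absolute magnitude; None without a redshift
    dv_km_s: Optional[float] = None   # velocity offset from host; None without a redshift


def is_isolated(host_abs_mag: float, host_app_mag: float,
                neighbors: Sequence[Neighbor], r_iso_mpc: float = 0.5,
                delta_m_iso: float = 0.0, dv_max_km_s: float = 1000.0) -> bool:
    """Return False if any neighbor inside r_iso_mpc disqualifies the host."""
    for nb in neighbors:
        if nb.proj_dist_mpc > r_iso_mpc:
            continue  # outside the isolation radius
        has_redshift = nb.dv_km_s is not None and nb.abs_mag is not None
        if has_redshift:
            # Condition (1): brighter in absolute magnitude and close in velocity.
            if (abs(nb.dv_km_s) <= dv_max_km_s
                    and nb.abs_mag < host_abs_mag + delta_m_iso):
                return False
        else:
            # Condition (2): no redshift, so compare apparent magnitudes.
            if nb.app_mag < host_app_mag + delta_m_iso:
                return False
    return True
```

Raising `delta_m_iso` above zero makes the cut more stringent, since neighbors somewhat fainter than the host then also disqualify it.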

The exact choice of R_iso naturally involves some tradeoffs.
The primary impact of varying R_iso will be to
change the purity and completeness of the sample of
isolated hosts. Here, purity refers to the fraction of
surviving MW analogs which are indeed isolated, that is,
which are not satellites of more massive galaxies.
Completeness is the fraction of truly isolated hosts that pass
through our isolation filter. To explore the effect of the
R_iso parameter on these statistics, we consider its impact
on dark matter halos identified in a cosmological N-body
simulation.

Figure 1 shows the completeness and purity of our
host sample as a function of R_iso for Bolshoi (Klypin
et al. 2010), an N-body dark matter simulation based on
the Adaptive Refinement Tree (ART) code. This
simulation assumed flat, concordance ΛCDM (Ω_M = 0.27,
Ω_Λ = 0.73, h = 0.7, and σ_8 = 0.82) and included 2048^3
particles in a cubic, periodic box with comoving side
length of 357 Mpc; the Bound Density Maxima algorithm
was used to identify halo properties (Klypin et al.
1999a). These parameters result in halo completeness
limits of about v_max > 50 km s^-1, which is small enough
to include the MW's massive satellites and is well below
the size of MW hosts.

6 This happens occasionally, though infrequently, owing to fiber
conflicts in the SDSS spectrograph.

For comparison with observations, we identify MW-sized
halos (in the range of 10^12 M_sun to 2 x 10^12 M_sun) and
compute the projected distance R_l to the nearest larger
halo within a redshift range of ±1000 km s^-1 (projected
distances are calculated in the x-y plane, and
redshift-space distances are calculated using halo positions and
velocities along the z-axis). An MW-sized halo is classified
as a satellite if it is within the virial radius of a larger
halo; otherwise, it is classified as a host. Thus, for a
given R_iso, the completeness is calculated as the fraction
of MW-sized host halos with R_l > R_iso, and the purity
is the fraction of hosts within the set of MW-sized halos
with R_l > R_iso.
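
The definitions of purity and completeness just given translate directly into a few lines of code. The sketch below assumes each mock MW-sized halo is summarized by its projected distance to the nearest larger halo and a flag saying whether it is a true host; the function name, input layout, and toy catalog are hypothetical.

```python
def purity_and_completeness(halos, r_iso_mpc):
    """Compute (purity, completeness) of an isolation cut.

    halos: iterable of (dist_to_larger_mpc, is_host) pairs for MW-sized halos.
    """
    halos = list(halos)
    n_hosts = sum(1 for _, is_host in halos if is_host)
    # Halos that survive the isolation cut, whether or not they are true hosts.
    passing = [is_host for dist, is_host in halos if dist > r_iso_mpc]
    n_passing_hosts = sum(passing)
    purity = n_passing_hosts / len(passing) if passing else float("nan")
    completeness = n_passing_hosts / n_hosts if n_hosts else float("nan")
    return purity, completeness


# Toy mock catalog (made-up values, not drawn from any simulation).
toy_halos = [(0.2, False), (0.3, True), (0.6, True),
             (0.8, True), (0.7, False), (1.5, True)]
purity, completeness = purity_and_completeness(toy_halos, r_iso_mpc=0.5)
```

In this toy catalog, increasing the cut from 0.5 to 1.0 Mpc raises the purity to 1.0 but lowers the completeness to 0.25, which is the trade-off the isolation radius has to balance.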

As one would expect, as R_iso increases, our sample
becomes more isolated, and purity improves, but this is at
the expense of rejecting truly isolated systems from our
sample. Relatively high purity is important to us as it
impacts the relevance of our results to the MW. However,
a more complete sample will improve our resilience
to selection effects. The choice for R_iso thus seeks to
maximize completeness while holding impurities to an
acceptably low level. The results of our N-body
investigation are encouraging. An isolation radius of 500 kpc
(which would count the MW as isolated, since M31 is
~ 700 kpc distant) gives purity above 95%, while still
permitting a completeness of ~ 85%. We therefore fix
R_iso at 0.5 Mpc for the core of our analysis and obtain
a sample of 22,581 isolated MW analogs, extending out
to z = 0.12 (in Stripe 82, with its fainter magnitude limit,
we get 1946 MW analogs out to z = 0.15). We show
in Section 5.3 that our results are stable upon variation
of the isolation radius, which implies that impurities at
the few-percent level do not have a significant effect. We
also require that all potential MW analogs be at least
a distance R_iso away from the edge of the observed
region. Since we are working at low redshift, the narrow
southern SDSS stripes provide little useful area for our
analysis, and so we neglect them.

2.3. Analogs of the Magellanic Clouds 
The LMC and SMC are 2.4 and 3.8 magnitudes fainter,
respectively, than the MW in the V band (van den Bergh
2000). Since the V and r bands overlap strongly, similar
magnitude differences will hold in r. To find analogs of
the LMC and SMC, we thus search for galaxies around
our isolated hosts within an aperture of physical size R_sat
on the sky, and with apparent magnitudes in the range
m_host + 2 to m_host + 4. (We work in apparent
magnitudes to enable the use of the full SDSS photometric
catalog, since the magnitude difference between satellites
and their hosts should be the same in apparent or
absolute values.)
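
The magnitude window for candidate satellites, and the luminosity ratios it corresponds to, can be sketched as follows. The 2 to 4 magnitude window comes from the text; the function names are hypothetical, and the flux-ratio relation is the standard definition of the magnitude scale.

```python
def is_mc_like_candidate(sat_app_mag, host_app_mag, dm_min=2.0, dm_max=4.0):
    """True if a galaxy is between dm_min and dm_max magnitudes fainter than its host."""
    delta_mag = sat_app_mag - host_app_mag
    return dm_min <= delta_mag <= dm_max


def flux_ratio(delta_mag):
    """Satellite-to-host flux ratio for a satellite delta_mag magnitudes fainter."""
    return 10.0 ** (-0.4 * delta_mag)
```

Under this relation the window spans satellites with roughly 16% down to 2.5% of the host luminosity, and the quoted offsets of 2.4 and 3.8 magnitudes for the LMC and SMC correspond to about 11% and 3%.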

The appropriate choice of R_sat is not entirely clear.
The virial radius of simulated dark-matter halos similar
to the MW is ~ 250 kpc (Busha et al. 2010a), so that
might seem a reasonable value. On the other hand, the
LMC and SMC are both within 100 kpc of the MW (at
distances of 50 and 63 kpc, respectively; van den Bergh
2000), so if we truly want MW analogs, we might prefer
a lower value of R_sat ~ 100 kpc. The closeness of the
LMC and SMC is likely to be happenstance, however,
and an overly restrictive value for R_sat could lead us to
underestimate the abundance of MW-MC analogs.

A further consideration in choosing R_sat is contamination
from background objects. Our primary analysis,
which searches for satellites in the photometric catalog, relies
on statistical subtraction of background contamination.
As is discussed in Section 3.4, before counting potential
satellites, we use photometric redshift information
to exclude a large fraction of the background objects.
This cut is necessarily quite conservative, however, to
avoid excluding true satellites from our sample, so
significant background remains. The larger R_sat becomes,
the higher the background contamination becomes, and
the larger the statistical noise it induces in our results,
as discussed in Section 4.1. We will thus want to keep
R_sat as small as is reasonable, to maximize our precision.
For our primary analysis, we set R_sat = 150 kpc,
which strikes a reasonable balance between these different
considerations. In Section 5.3 we explore the impact
of varying R_sat and find it to be small but not
insignificant.

Deblending is a final concern. If a satellite is very
close to its host (either physically or in projection), the
SDSS reduction pipeline might not identify it as a separate
source. Thus, when we perform background subtraction
to obtain a spherical search volume for satellites in
Section 3.2.2 below, we are actually considering a
bead-shaped volume, with a cylinder of radius of order 10 kpc
(roughly the radius of an MW analog) removed from the
center. Since R_sat is much larger than the size of a typical
host galaxy, however, this cylinder represents < 1%
of the search volume (Figure 2 gives a visual impression
of the relative distances involved). The impact of
deblending on our results should therefore be negligibly
small, and we neglect it in what follows.
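
The statement that the excluded cylinder is under 1% of the search volume can be checked with elementary geometry. The sketch below assumes an idealized cylinder of radius 10 kpc passing through the center of a sphere of radius 150 kpc; the volume left after drilling such a hole is (4/3)π(R² - a²)^(3/2), so the removed fraction depends only on the ratio a/R. The function name is hypothetical.

```python
def excluded_fraction(r_cyl_kpc, r_sphere_kpc):
    """Fraction of a sphere's volume removed by a cylinder through its center."""
    ratio_sq = (r_cyl_kpc / r_sphere_kpc) ** 2
    # Remaining ("bead") volume fraction is (1 - (a/R)^2)^(3/2).
    return 1.0 - (1.0 - ratio_sq) ** 1.5


fraction = excluded_fraction(10.0, 150.0)
```

For these values the excluded fraction is about 0.7%, close to the small-ratio approximation 1.5 (a/R)².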

2.4. A preliminary analysis: Magellanic Clouds in the 
SDSS spectroscopic catalog 

In this section, we generate preliminary results working 
exclusively within the SDSS spectroscopic catalog. This 
data set includes only the brightest objects (r < 17.77) 
in the survey, for which spectra were obtained. Although 
these results will be subsumed by a more precise and sys- 
tematically robust result using photometrically selected 
satellites, we include the brief analysis to illustrate the 
conceptual simplicity of our main undertaking as well as 
to motivate the search in the deeper photometric catalog. 

Though the stated magnitude limit of objects in the 
spectroscopic catalog is r = 17.77, we trim the set at 
r = 17.60 to avoid selection complications near the com- 
pleteness limit that arise from recalibrations of the pho- 
tometry since the main sample was selected. This limit 
applies to all satellites, which implies a minimum magnitude
limit of r = 13.60 for hosts if we allow MC-like
satellites to be four magnitudes dimmer than their hosts.

The brightest 199 members of the MW-sized galaxies
selected using the host-finding procedure outlined in
Section 2.2 (with R_iso = 0.5 Mpc, ΔM_iso = 0) have redshifts
between 0.01 and 0.026 and r-band magnitudes between
12.05 and 13.60 [SDSS units]. The search conducted
around these 199 MW-like hosts identifies as MC-like
any galaxy that (1) has an absolute magnitude (M_r) between
two and four magnitudes dimmer than that of
the host, (2) lies within a physical projected
radius of 150 kpc of said host, and (3) has a redshift
within Δz_max of the redshift of the host. The redshift
difference Δz_max = 0.001 is equivalent to a ~ 300 km s^-1
velocity dispersion, chosen as a reasonable upper bound
for the line-of-sight relative motion between an MW host
and potential satellite.

The value of Az max also sets the uncertainty in line- 
of-sight position of any potential satellite, such that the 
geometry described by our limits is not a sphere cen- 
tered on the candidate host but a cylinder with the same 
projected dimensions. The cylinder has a half-length of 



approximately 3 Mpc (as it happens, this is roughly the 
correlation length of an MW-luminosity host), and any 
interloper galaxies within it are indistinguishable from a 
true satellite. A systematic correction would be needed 
to convert this result to expected counts within the desired
spherical region with radius R_sat (see Section 4.2.2).
We do not perform the correction here because the preci- 
sion of our results is already limited by our small sample 
size. SDSS fiber collisions will introduce a further source 
of systematic error for which we would need to correct, 
although this is likely to be small since the 55 arcsecond 
SDSS fiber-collision radius corresponds to only ~ 2% of 
the search cylinder for a typical spectroscopic host. A 
more careful analysis of the spectroscopic data in the 
case of LMC analogs is also in preparation by a differ- 
ent set of authors (E. Tollerud et al.). In any case, we 
will derive more precise, systematically corrected results 
from the photometric sample in what follows. 

Here, we simply quote the result for objects with MC-like
properties found within the cylindrical redshift-space
volume described above. This method, though failing to
provide the desired search geometry, corresponds most
closely to the results obtained in other spectroscopic
searches for satellites (e.g., Zaritsky et al. 1993;
James & Ivory 2010), with which our results are broadly
consistent. It also has the advantage of exact identification of
individual correlated objects (unlike our results in what
follows, which are purely statistical). Figure 2 shows a
mosaic of some likely MW-MC-like systems identified in
the spectroscopic catalog using this procedure. In all,
from these 199 hosts, we find that 132 (or 66.3%) have
zero, 51 (or 25.6%) have one, 16 (or 8.0%) have two,
and none have more than two MC-luminosity galaxies
within the search cylinder. This number-count distribution
is compared with the equivalent results from our
larger photometric samples in Figure 9. Even without
careful systematic correction, this result stands as
qualitative confirmation of simulations (e.g., Boylan-Kolchin
et al. 2010; Busha et al. 2010b) showing that MW-like
halos have two MC-like satellites less than 10 percent
of the time.
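
The spectroscopic tally quoted above is small enough to reproduce by hand. The sketch below recomputes the fractions from the host counts given in the text and attaches a simple binomial standard error to each; that error estimate is an assumption added here for illustration and is not an uncertainty quoted in the paper.

```python
import math

# Hosts with N MC-like satellites in the search cylinder (counts from the text).
counts = {0: 132, 1: 51, 2: 16}
n_hosts = sum(counts.values())


def fraction_with_error(k, n):
    """Fraction k/n with a normal-approximation binomial standard error."""
    p = k / n
    return p, math.sqrt(p * (1.0 - p) / n)


fractions = {n_sat: fraction_with_error(k, n_hosts) for n_sat, k in counts.items()}
for n_sat, (p, err) in fractions.items():
    print(f"N_sat = {n_sat}: {100 * p:.1f}% +/- {100 * err:.1f}%")
```

With only 199 hosts, the uncertainty on the two-satellite fraction is a sizable part of the value itself, which illustrates why the larger photometric sample is needed.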



3. METHODS 



In the last section, we motivated the need to move 
beyond the SDSS spectroscopic sample to obtain sta- 
tistically robust results. The easiest way to obtain a 
larger sample is to make use of the deeper SDSS pho- 
tometric catalog. Without precise redshift information, 
our analysis will depend on careful background subtrac- 
tion, since line-of-sight projection effects conflate actual 
satellites with background objects. In essence, we trade 
the ability to identify individual satellites around a host 
for a substantial increase in statistical power. As
discussed in Section 3.4, we can make the task somewhat
easier by using photometric redshifts to exclude obvious
background objects, but the photo-zs do not have
sufficient precision to identify line-of-sight interlopers on a
system-by-system basis. We introduce here an ensemble 
treatment of background subtraction performed on our 
expanded set of MW hosts. 

Fig. 2. — Images of selected MW-like hosts with exactly two MC-like satellites in the SDSS spectroscopic catalog, identified as those
objects within a radius of 150 kpc and within 300 km s^-1 of the host. Each image is scaled to 300 physical kpc on a side, centered on
the host galaxy. Satellites identified as MC-like companions are circled in yellow. The 1st, 2nd, 4th, and 11th images (counting from left
to right, top to bottom) show at least one bright, close companion to the MW-sized host. Image 11 shows two such objects at the same
redshift as the central galaxy. In each of these cases, the companion is recognized as a satellite of the host but is too luminous to meet
our criteria for being an MC-like satellite. The 5th, 6th, 8th, 9th, and 11th images feature prominent background objects with spectra at
dissimilar redshifts. Background objects without spectra are clearly visible in every panel. The 5th and 12th panels exhibit fiber collisions.
The blue object next to the upper left MC-like satellite in panel 5, though bright enough, did not have its spectrum collected or analyzed;
similarly, the object to the right of the bluer MC-like satellite in panel 12 has no redshift or absolute magnitude information due to fiber
collisions.

Our desired result is the probability distribution function
p(S), the probability that N_sat MC-like galaxies are
present within R_sat of an MW-sized galaxy. We arrive
at our measurement via a four-step process, as outlined
below.



1. We count the total number of galaxies, T, around
   each candidate host that meet our magnitude and
   projected angular distance (R_sat) selection criteria
   (see Section 2.3). We use these to build a normalized
   probability distribution, the composite counts
   PDF, p(T).

2. We estimate the PDF of the background "noise"
   counts, p(B), by counting galaxies meeting our
   satellite criteria in fields that do not contain an
   MW-analog host. The selection of these noise fields
   has important implications for the systematic
   uncertainty in our final result, so we take two quite
   different approaches to constructing them (see
   Section 3.2), estimate the systematic errors in each
   case, and check that the results are consistent.

3. We extract the desired signal PDF p(S) via
   deconvolution. Assuming signal and noise to be
   independent, the distribution p(T) measured in step 1
   is simply the convolution of p(S) with p(B); thus,
   a straightforward deconvolution in Fourier space is
   all that is necessary to reconstruct p(S).

4. We estimate and correct for systematic effects that
   arise from catastrophic photo-z errors and from
   mis-estimation of the background contribution
   owing to large-scale structure, finally arriving at our
   best estimate for p(N_sat), the probability of an MW
   analog's hosting N MC-like satellites.



3.1. Composite Counts 

In Section 2.2, we presented our process for selecting
22,581 MW analogs. Each host serves as the center of an
individual search aperture, whose angular size varies with
the host redshift but always corresponds to a transverse
physical distance R_sat. For each aperture, a tally is made
of all objects that fit our criteria for LMC/SMC-like
satellites (e.g., having apparent r-band fluxes between 2
and 4 magnitudes fainter than the host, in our baseline
analysis). The normalized histogram of these total number
counts is denoted by p(T); it represents our primary
measurement in this study.
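As a minimal sketch of this tally (not the code used for our results), the following shows how per-host counts could be turned into the normalized histogram p(T). The function names, the toy tallies, and the example distance are all hypothetical; the angular radius uses the small-angle relation with an angular-diameter distance that the caller must supply, since no cosmology is specified here.

```python
import math
from collections import Counter

def angular_radius_deg(r_sat_kpc, d_a_mpc):
    """Angle (degrees) subtended by a transverse physical radius r_sat_kpc
    at angular-diameter distance d_a_mpc (small-angle approximation)."""
    return math.degrees(r_sat_kpc / (d_a_mpc * 1000.0))

def counts_pdf(tallies, n_bins):
    """Normalized histogram of per-host counts over bins 0..n_bins-1.
    Tallies at or above n_bins are dropped before normalizing."""
    hist = Counter(tallies)
    total = sum(hist[k] for k in range(n_bins))
    return [hist[k] / total for k in range(n_bins)]

# Made-up tallies for eight hypothetical hosts (illustration only).
tallies = [0, 2, 1, 0, 3, 1, 1, 0]
p_T = counts_pdf(tallies, n_bins=5)
theta = angular_radius_deg(150.0, 400.0)  # 150 kpc seen at an assumed 400 Mpc
```

In practice the tallies would come from a catalog query within each aperture; only the normalization step is illustrated here.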

3.2. Background estimation 

To estimate the PDF of background number counts, 
p(B), we take a similar approach to earlier studies (e.g., 
Holmberg 1969; Lorrimer et al. 1994). In brief, we count
galaxies that meet our satellite selection criteria, within 
comparison regions on the sky that do not contain a 
galaxy that meets our selection criteria for hosts (but 
that otherwise meet the isolation criteria). Previous au- 
thors taking this approach made use of data from photo- 
graphic plates, and they wisely used comparison regions 
on the same plate as their host galaxies, so the compari- 
son regions were quite nearby the hosts. In our case, we 
have access to a large, well-calibrated photometric survey 
field, so it is possible to choose comparison regions that 
are arbitrarily distant from the hosts. Since the estimation
of background noise will be the dominant source of
systematic error in this study, it is important to consider
the choice of comparison fields carefully. We take two
different approaches, which are subject to different sources
of systematic error, in order to test the robustness of our
results.

Fig. 3. — Schematic diagrams of our background subtraction
procedures. (1) The two volumes corresponding to the center search
aperture and the adjacent annulus are pictured. Red dots represent
objects within actual physical distance R_sat of the host, green
dots are objects outside R_sat but correlated with the host, and
grey dots show random foreground and background objects.
Simulations confirm that the numbers of random and correlated
background objects in the two volumes are approximately equal. (2)
The result for random background subtraction is shown. The random
background has been subtracted, but correlated line-of-sight
structures remain, resulting in an effective cylindrical search
volume (represented schematically by the green shaded region). (3)
The result of annular background subtraction is shown. Both random
and correlated line-of-sight objects have been subtracted. This
is our best estimate of the desired result, the number of satellites
within a radial distance R_sat of the host.

3.2.1. Isotropic Background 

The simplest, most naive approach is to estimate the 
background from random locations on the sky. More 
specifically, we randomize the sky positions of our host 
sample within the SDSS NGC region. It is important 
for the sake of comparison that the search is performed 
on an identical distribution of aperture sizes and refer- 
ence magnitudes, however, so we do not randomize the 
host redshifts or luminosities. Approximately 25,000 ran- 
domized sky positions are generated and each is associ- 
ated with a set of object properties (absolute magnitude, 
apparent magnitude, redshift) belonging to a randomly 
chosen target host. 

These search centroids are then subjected to the same
isolation conditions as the targets, with one additional
constraint. As before, no search center may be within R_iso
of an object brighter than the host from which the search
parameters were derived. Now, in addition, no search
center may be within 2R_sat of any MW-sized galaxy that
is within 1000 km s^-1 of the search redshift, as we hope
not to contaminate our noise profile with signal. The
histogram of counts of MC-like objects around these
locations is then used to generate the isotropic background
PDF, p(B_iso).

This is likely to be an underestimate of the background 
around our hosts, however. Because galaxies are clus- 
tered, regions around hosts are generally denser than 
average, with a typical correlation length many times
longer than the R_sat of this study or even the R_vir of a galaxy.
Thus, though we have measured the random background
noise, one would expect the total noise in our search
apertures to be above random due to the contribution of
projected correlated galaxies outside our region of interest
described by R_sat.

If no correction is made for this effect, we have in
essence counted all correlated objects within a cylinder of
length roughly the correlation length r_0 (see Figure 3),
which is clearly an overestimate of the satellite
population. Fortunately, it is straightforward to compute a
correction for this systematic undersubtraction, via
integrals over the galaxy correlation function. We derive this
in Section 4.2.2 and will apply it to our results derived
using the isotropic background estimate.

It is worth noting, however, that a single correlation
length (r_0 ~ 3 Mpc; Zehavi et al. 2010) along the line of
sight corresponds to ~300 km s^-1 in redshift space; that
is, it is the same length as the search cylinder we used
for our preliminary spectroscopic analysis in Section 2.4.
In other words, the results derived from isotropic
background subtraction are roughly equivalent to our results
in the spectroscopic catalog. Even in the case of perfect
spectroscopic information, it is necessary to account for
the presence of correlated objects along the line of sight
if we wish to probe the true satellite population. Since
we did not attempt to correct our spectroscopic result in
Section 2.4, we will also present our uncorrected results
for isotropic background subtraction, for the purpose of
comparison.

3.2.2. Annular Background 

It is also possible to estimate the random and corre- 
lated background simultaneously and directly by placing 
our comparison fields very close to the MW-analog hosts. 
In particular, we can estimate the background noise by 
counting galaxies that meet our satellite selection crite- 
ria in an annulus around each host galaxy, but outside 
the initial satellite search aperture, provided that the an- 
nulus has the same projected area as the center search 
region. (This technique is similar in spirit to the one
that Chen et al. 2006 found to be optimal for interloper
removal in spectroscopic data.) The histogram of counts
in an annulus around each host then gives the correlated
annulus background PDF, p(B_ann).

A schematic diagram of this approach is shown in
Figure 3, panel (1). The central column, with a sphere of
radius R_sat cut out of it, has a slightly smaller volume
than an annulus of the same projected area, so the counts
in the annulus will tend to be slightly enhanced relative
to the central cylinder. But, counteracting this, there is
also, on average, a lower density of correlated objects in
the annulus, owing to the larger distance from the host.
The two effects will cancel for some particular choice of
the annular radius, although it is difficult to justify a
priori a particular choice of this radius.

We can make some use of N-body simulations to
help guide our choice. In particular, we make use
of a mock galaxy catalog generated by abundance-matching
galaxy luminosities from the low-luminosity
survey of Blanton et al. (2005a) to dark matter halos in
the Bolshoi simulation. With this catalog, we can apply
the same selection cuts to MW hosts (luminosity
and isolation criteria) as we apply to observed SDSS
galaxies. Then, we may compare the number of objects
with luminosities similar to the LMC and SMC (i.e., 2-4
magnitudes fainter than their host galaxy) in both
cylinders with the inner spheres removed, and in hollow
cylinders, as shown in panel (1) of Figure 3.

The most natural choice for the background-estimation
annulus is the region immediately outside the search
aperture, with R_sat^2 < r_ann^2 < 2R_sat^2, which we will call
Annulus I. In this case, tests from our mock catalogs
show that the counts in the inner and outer cylinder are
roughly equal, with the counts in the annulus possibly
exceeding the counts in the search aperture, but by no
more than ~10%. If we move the annulus outward to
1.5R_sat^2 < r_ann^2 < 2.5R_sat^2 (which we will call Annulus
II), the annulus counts in the simulation appear to
underestimate the aperture counts slightly, but again by no
more than ~10%.

Because the N-body models give only a rough approx- 
imation of our measurements in SDSS, and because we 
would prefer not to rely too heavily on simulations for 
our observational results, we do not attempt to further 
optimize the radius of our search annuli. Instead, since 
our two annuli appear to tightly bracket the optimal one, 
we will take the results using Annulus I to be our pri- 
mary results, and we will compare to the results using 
Annulus II to estimate the size of the residual systematic 
uncertainty. 
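The equal-area property of the two annuli can be checked with a short sketch. The helper names and the toy separations below are hypothetical and purely illustrative; radii are in kpc, and the inner edge is specified through its squared radius in units of R_sat^2 (1.0 for Annulus I, 1.5 for Annulus II).

```python
import math

def annulus_bounds(r_sat, inner_sq_factor):
    """Inner and outer radii of an annulus whose projected area equals that of
    a disk of radius r_sat, with inner radius squared = inner_sq_factor * r_sat**2."""
    r_in = r_sat * math.sqrt(inner_sq_factor)
    r_out = r_sat * math.sqrt(inner_sq_factor + 1.0)
    return r_in, r_out

def count_between(separations, r_in, r_out):
    """Number of projected separations falling in the half-open range [r_in, r_out)."""
    return sum(1 for s in separations if r_in <= s < r_out)

r_sat = 150.0
annulus_1 = annulus_bounds(r_sat, 1.0)   # R_sat^2 < r^2 < 2 R_sat^2
annulus_2 = annulus_bounds(r_sat, 1.5)   # 1.5 R_sat^2 < r^2 < 2.5 R_sat^2

# Toy projected separations (kpc) of candidate galaxies from one host.
separations = [50.0, 160.0, 200.0, 300.0]
n_aperture = count_between(separations, 0.0, r_sat)
n_annulus_1 = count_between(separations, *annulus_1)
```

Because both annuli enclose the same projected area as the search aperture, their raw counts can be compared with aperture counts without any area rescaling.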

3.3. Signal Extraction via Deconvolution 

We make the assumption that the number of actual
satellites to be found around an MW-sized galaxy is
unrelated to the number of background objects which might
be projected into the same aperture. That is, S, the
signal, and B, the noise, are independent random variables. Their
sum is a third random variable, T = S + B. This
implies that the probability distribution of T is just the
convolution of the S and B PDFs:

$$p(T) = p(S) * p(B) \equiv \sum_{S'=0}^{T} p(S')\,p(B' = T - S'), \qquad (1)$$

where the * symbol indicates convolution.

By using the methods described in Sections 3.1, 3.2.1,
and 3.2.2, we have precise measurements of p(T),
p(B_iso), and p(B_ann), respectively. We are interested in
p(S_cor), the probability of encountering S_cor LMC/SMC-like
correlated galaxies within a cylinder of radius R_sat
centered on an MW-sized host. This is computed by
deconvolving p(B_iso) from p(T). More importantly, we
wish to obtain p(S_sat), the probability of encountering
S_sat MC-like satellites within a sphere of physical radius
R_sat around such a host. This can be derived by applying
a systematic correction to p(S_cor) (see Section 4.2.2) or
by deconvolving p(B_ann) from p(T).

The deconvolutions take place in three steps. First we
transform into Fourier space using a fast Fourier transform
(FFT; this is indicated by the operator $\mathcal{F}$ below).
Then a convolution is simple multiplication:

$$\mathcal{F}(p(T)) = \mathcal{F}(p(B)) \cdot \mathcal{F}(p(S)). \qquad (2)$$

By rearranging this equation we obtain $\mathcal{F}(p(S))$, and an
inverse FFT retrieves p(S) in each case. For an example
(and a preview of our results), see the left-hand panel of
Figure 7. There, the blue curve (p(T)) can be obtained
by the forward convolution of the red curve (p(B)) with
the green curve (p(S)) as in Equation 1. In practice, we
have measured the red and blue curves and deconvolved
them via Equation 2 to extract the green curve.
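Equations 1 and 2 can be illustrated with a toy example. The sketch below is not our analysis code: it uses a plain discrete Fourier transform written with the standard library in place of an FFT routine (an FFT library such as `numpy.fft` would be the usual choice), and the input PDFs and function names are made up for illustration. It assumes that p(T) is sampled over the full support of the convolution and that the transform of p(B) has no zeros.

```python
import cmath

def dft(x, n):
    """Length-n discrete Fourier transform of x (implicitly zero-padded)."""
    return [sum(v * cmath.exp(-2j * cmath.pi * f * k / n) for k, v in enumerate(x))
            for f in range(n)]

def idft(spec):
    """Inverse transform; returns the real part of each sample."""
    n = len(spec)
    return [sum(v * cmath.exp(2j * cmath.pi * f * k / n) for f, v in enumerate(spec)).real / n
            for k in range(n)]

def convolve(p_s, p_b):
    """Direct evaluation of Equation 1 for two discrete PDFs."""
    out = [0.0] * (len(p_s) + len(p_b) - 1)
    for i, a in enumerate(p_s):
        for j, b in enumerate(p_b):
            out[i + j] += a * b
    return out

def deconvolve(p_t, p_b):
    """Equation 2 rearranged: divide the transforms, then invert."""
    n = len(p_t)
    ratio = [t / b for t, b in zip(dft(p_t, n), dft(p_b, n))]
    return idft(ratio)

# Toy signal and background PDFs (illustrative values only).
p_S = [0.80, 0.15, 0.05]
p_B = [0.50, 0.30, 0.20]
p_T = convolve(p_S, p_B)          # forward model, Equation 1
recovered = deconvolve(p_T, p_B)  # should reproduce p_S, padded with ~0
```

With measured, truncated, and noisy histograms the division is no longer exact, which is the origin of the ringing discussed in Section 4.1.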

3.4. Use of Photometric Redshifts 

As discussed below, the statistical errors in p(S)
depend strongly on the typical number of background
("noise") galaxies in the search aperture. We can
therefore greatly improve the precision of our results by
making use of photometric redshift information to exclude
obvious background galaxies before we begin the
satellite-counting exercise outlined above. Because photometric
redshift estimates are highly prone to catastrophically
large errors, especially for faint galaxies, we do not
attempt to use photo-zs to identify the actual satellites of
individual hosts; instead, we merely use them to make a
conservative initial background cut.

Best-fit photometric redshift values and p(z) probability
distributions are computed by Cunha et al. (2009) for
each photometric object and are made publicly available
on the SDSS DR7 webpage. We make a cut in the
imaging catalog on best-fit photo-z at some threshold value
z_phot,max and exclude galaxies with higher photo-zs from
our sample. Because photo-z estimates are prone to
inaccuracies and particularly to catastrophic errors, any cut
on z_phot will wrongly exclude some number of galaxies
that are actually satellites at low redshift. This will
introduce a systematic undercounting of satellites around
MW-analog hosts.

We would like our sample of low-redshift galaxies to
be as free as possible of background objects, to
reduce our statistical errors, while also being highly
complete, to minimize systematic errors. However,
increasing z_phot,max increases the background noise and worsens
our statistical errors, while reducing z_phot,max rejects an
increasing number of true satellites and increases our
systematics. This trade-off is shown in Figure 4, where we
have plotted the summed p(z) distributions for a
representative set of possible MC-like galaxies with z_phot
above and below 0.23. There is a tail of high-photo-z
galaxies that are actually located at z < 0.12 (dotted
curve) and hence ought to be considered as potential
satellites; their exclusion causes a systematic
undercounting of satellites. There is also a large number of
galaxies with z_phot < 0.23 that have true z > 0.12 (solid
line); these act as background noise and contribute to
the statistical errors. In the next section, we derive in
detail the impact of these two sources of error and their
dependence on the photo-z threshold. We find that a
value of z_phot,max = 0.23 strikes a reasonable balance
between the purity and completeness of satellites, as the
resulting statistical and systematic errors have roughly
equal amplitude.

Fig. 4. — Distribution of true redshifts for two galaxy samples
divided by photometric redshift. The dashed histogram indicates
the average normalized p(z) distribution from Cunha et al. (2009)
for galaxies with z_phot > 0.23, and the solid histogram is the same
distribution for galaxies with z_phot < 0.23. By comparing the
amplitude of the two curves at z < 0.12 it is possible to estimate
the relative numbers of potential z < 0.12 satellites that are kept
(solid line) and excluded (dashed line) by our photo-z cut. These
differ by roughly an order of magnitude, so the required systematic
corrections to our satellite counts are expected to be on the order
of 10%.

4. ERROR BUDGET 
4.1. Statistical Errors 

The statistical uncertainty in our measured p(S) has 
three sources. The first is the overall size of our sample of 
MW-like host galaxies: as our sample size increases, we 
expect that the precision of our result should improve as 
well, owing to reduced Poisson noise. More specifically, 
the uncertainty in our measured composite counts PDF, 
p(T), should be purely Poisson at each value of T. The
second source of error is noise from the background: if
we increase our photometric redshift cut, we increase the
number of background galaxies in our total composite
counts and hence we reduce the signal-to-noise ratio of
our final measurement. More specifically, the isotropic
background PDF, p(B), is a source of noise that propagates
through our analysis in Fourier space to our final
measurements for p(S). Our sample size is large enough 
in our primary analysis that we are dominated by this 
second source of error. 

A final source of statistical error may arise from sample 
variance (also sometimes called cosmic variance). Al- 
though our observational regions are likely numerous 
enough that this is not a dominant source of error, it 
is possible that the variance in our composite or noise 
counts exceeds simple Poisson noise. To fully charac- 
terize the variance in our sample, therefore, we use the 
jackknife technique to estimate the errors on our measured
p(T) and p(B). We divide our spatially contiguous
set of MW-sized hosts into 50 subsets, each of which
may contain a different number of galaxies but occupies an
equal area on the sky. In each iteration, a different subset
is omitted, and the remaining 49 of the 50 tiles produce a
normalized PDF of counts (composite or noise). The result is 50
different values for each histogram bin. Their mean is
the unbiased PDF; the error on the mean is approximately
√(n − 1) · σ_i, where n = 50 is the number of subsets and
σ_i is the standard deviation in each bin, i.
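The jackknife estimate can be sketched as follows. This is an illustrative toy, not our pipeline: the function names are hypothetical, the input is assumed to be a list of per-tile lists of counts, and the standard deviation is taken over the n leave-one-out PDFs (population form), so that the error in each bin is √(n − 1) times that scatter.

```python
import math
from collections import Counter

def counts_pdf(tallies, n_bins):
    """Normalized histogram of counts over bins 0..n_bins-1."""
    hist = Counter(tallies)
    total = sum(hist[k] for k in range(n_bins))
    return [hist[k] / total for k in range(n_bins)]

def jackknife_pdf(tallies_by_tile, n_bins):
    """Leave-one-tile-out mean PDF and jackknife error per bin."""
    n = len(tallies_by_tile)
    samples = []
    for skip in range(n):
        kept = [t for i, tile in enumerate(tallies_by_tile) if i != skip for t in tile]
        samples.append(counts_pdf(kept, n_bins))
    mean = [sum(s[k] for s in samples) / n for k in range(n_bins)]
    sigma = [math.sqrt(sum((s[k] - mean[k]) ** 2 for s in samples) / n)
             for k in range(n_bins)]
    err = [math.sqrt(n - 1) * sg for sg in sigma]
    return mean, err

# Four toy tiles with made-up per-host counts (the analysis uses 50 tiles).
tiles = [[0, 1, 0], [0, 0, 2], [1, 0, 0], [0, 1, 1]]
mean_pdf, err_pdf = jackknife_pdf(tiles, n_bins=3)
```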

It is possible in principle to propagate these uncertain- 
ties analytically through the Fourier analysis to obtain 
final errors on p(S). However, the scalings involved are 
rather non-intuitive, and the calculation is prone to
numerical instabilities when the p(B) and p(T)
distributions are truncated at some maximum abscissa value,
which is typically necessary. Therefore, we propagate
uncertainties through the deconvolution using a stochastic
approach. The same FFT deconvolution is performed
approximately one million times, each time with a set of
values for p(T) and p(B) randomly drawn from Gaussian
distributions with the means and standard deviations
found in the jackknife analysis. To mute the effects
of ringing in the deconvolved result, we keep only those
trials whose resulting probability densities are nonnegative
everywhere. The median in each bin of the satellite
counts PDF is our result for an MW-sized galaxy's
probability of hosting S = 0, 1, 2, 3, ... MC-like satellites. Error
bars bracket the 68% confidence interval.
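A scaled-down sketch of this stochastic propagation is given below. It is illustrative only: the function names, toy PDFs, and error values are assumptions; a plain discrete Fourier transform from the standard library stands in for an FFT; the nonnegativity test is applied to the first `n_keep` bins only; and a few hundred trials are used rather than the roughly one million of the actual analysis.

```python
import cmath
import random
import statistics

def dft(x, n):
    """Length-n discrete Fourier transform of x (implicitly zero-padded)."""
    return [sum(v * cmath.exp(-2j * cmath.pi * f * k / n) for k, v in enumerate(x))
            for f in range(n)]

def deconvolve(p_t, p_b):
    """Divide the transforms of p_t and p_b, then invert (real part)."""
    n = len(p_t)
    ratio = [t / b for t, b in zip(dft(p_t, n), dft(p_b, n))]
    return [sum(v * cmath.exp(2j * cmath.pi * f * k / n) for f, v in enumerate(ratio)).real / n
            for k in range(n)]

def propagate(p_t, err_t, p_b, err_b, n_keep, n_trials, seed=0):
    """Perturb p(T) and p(B) with Gaussian noise, deconvolve each trial, keep
    trials that are nonnegative in the first n_keep bins, and summarize."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_trials):
        t = [rng.gauss(m, s) for m, s in zip(p_t, err_t)]
        b = [rng.gauss(m, s) for m, s in zip(p_b, err_b)]
        trial = deconvolve(t, b)[:n_keep]
        if min(trial) >= 0.0:
            kept.append(trial)
    columns = [sorted(c) for c in zip(*kept)]
    median = [statistics.median(c) for c in columns]
    lo = [c[int(0.16 * len(c))] for c in columns]   # lower edge of ~68% interval
    hi = [c[int(0.84 * len(c))] for c in columns]   # upper edge of ~68% interval
    return median, lo, hi, len(kept)

# Toy inputs: p_T is the convolution of [0.80, 0.15, 0.05] with p_B.
p_T = [0.400, 0.315, 0.230, 0.045, 0.010]
p_B = [0.50, 0.30, 0.20]
median, lo, hi, n_kept = propagate(p_T, [1e-3] * 5, p_B, [1e-3] * 3,
                                   n_keep=3, n_trials=200)
```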

Following this procedure, we find that the stochastic 
error bars derived on p(S) are much larger than the error 
bars estimated for p(T) or p(B) in our jackknife analy- 
sis. This indicates that the error in p(S) is dominated by 
background noise, rather than counting statistics. When 
we are in this regime, increasing our sample size by a 
factor of order unity will not shrink our error bars as 
1/√n. Instead, our errors will scale roughly with the
average signal-to-background-noise ratio, ⟨S⟩/⟨B⟩. To
improve our errors, we would need to reduce the back- 
ground, for example by making a stricter photo-z cut 
(however, doing this would increase our systematic er- 
rors, as discussed in the next section). Because we are 
not limited by our sample size, we have taken an aggres- 
sive approach in our selection to excluding objects near 
the edges of the NGC region. 

To illustrate the scaling of our uncertainties, we
compute our errors for different values of ⟨S⟩/⟨B⟩. We can
directly obtain ⟨T⟩ and ⟨B⟩ from our basic number-count
measurements. Regardless of the shape of the PDFs, this
equation should then hold:

$$\langle S \rangle = \langle T \rangle - \langle B \rangle. \qquad (3)$$

The most direct way of varying the signal-to-noise
ratio is by shifting the maximum photo-z cut
mentioned in Section 3.4. Since the bulk of objects with
z_phot > 0.12 are background objects, changing z_phot,max
changes ⟨B⟩ while holding ⟨S⟩ roughly steady. For
0.17 < z_phot,max < 0.29, our signal-to-noise ratio varies
from approximately 0.27 to 0.15. For our adopted value,
z_phot,max = 0.23, ⟨S⟩/⟨B⟩ = 0.18.

Figure 5 plots the size of the error bars on p(S)
(computed as described in Section 4.1) against the choice of
photo-z cutoff and resulting ⟨S⟩/⟨B⟩ for S = 0, 1, 2. The
relationship between the photo-z cutoff and the statistical
uncertainty in our results demonstrates the need for
a maximum photo-z limit on the Sloan imaging catalog
in our analysis. ⟨S⟩/⟨B⟩ also varies with the search
aperture size R_sat, though this relationship is more
complicated, since the average signal (number of satellites)
depends on R_sat as well.
Fig. 5. — Various sources of uncertainty in our analysis and their
scaling with photo-z cutoff. The absolute statistical uncertainty
on p(S) increases as the photometric redshift cutoff increases and
the average signal-to-noise ratio decreases. This is shown by the
colored solid lines, for S = 0, 1, 2. Here "signal" is the average
number of MC-like satellites per Galaxy-sized host and "noise" is
the number of contaminating background objects. A competing systematic
effect arises from true satellites with inaccurate photo-zs, which 
are improperly rejected by our photo-z cut. The fraction of true 
satellites excluded in this manner is shown by the gray line, and 
the resulting systematic error is given by the colored dashed lines. 
Our adopted photo-z cutoff, indicated by the arrow, is chosen to 
approximately balance these two sources of error. 

4.2. Systematic Errors 

There are two primary sources of systematic error in 
our analysis. First, some fraction of true satellites will 
be subject to catastrophic photo-z errors and thus will 
be wrongly rejected in our background-exclusion cut. 
This will always cause a slight undercounting of
satellites. Second, our isotropic background estimation in
Section 3.2.1 assumes that background galaxies are
completely uncorrelated with the MW-analog hosts. Since
galaxies are in fact well known to be correlated, this tech- 
nique will lead to a slight overcounting of satellites from 
correlated objects along the line of sight. We address 
these two sources of systematic error in turn below. 



4.2.1. Photo-z losses 

To estimate the error caused by the photo-z cut in
Section 3.4, we use the p(z) information from Cunha et al.
(2009) to compute a loss fraction, η. This is the
average probability that any potential satellite object (that
is, an object with the appropriate properties and actual
redshift z < 0.12) will be cut out of our sample owing to
a catastrophic photo-z error.

Fig. 6. — Completeness of z < 0.12 objects (i.e., the fraction of
z < 0.12 objects with z_phot < z_phot,max) as the maximum photo-z

Photo-zs of dimmer galaxies are more error-prone
than those of brighter objects since the photometric
errors are larger. Thus, an average loss fraction must
be computed on a sub-sample of galaxies representative
of the apparent magnitude distribution of our potential
satellites. We construct this sample by iterating over
our MW-analog hosts and, for each host, randomly
selecting 1000 galaxies that are 2-4 magnitudes fainter and
adding them to our sample (this means that individual
faint galaxies will appear more than once in our sample,
but this allows us to obtain the correct magnitude
distribution). We operate on this set by dividing it into two:
those objects with best-fit photo-z above the threshold
z_phot,max, and those objects with best-fit photo-z below this
cut (see Figure 4).

Then, if L is the set of all potential satellites with
best-fit photo-z below z_phot,max, the loss fraction is

$$\eta = p(g \notin L \mid z_g < 0.12). \qquad (4)$$

Since we do not have direct access to the actual redshifts
of individual photometric objects, we use Bayes's
Theorem to rewrite the conditional probability in a more
accessible form:

$$\eta = p(g \notin L \mid z_g < 0.12) = \frac{p(z_g < 0.12 \mid g \notin L)\,p(g \notin L)}{p(z_g < 0.12)}. \qquad (5)$$



In Figure 4, L is represented by the solid line, and
the set of all other objects, which we can call M, is
represented by the dotted line, so p(z_g < 0.12 | g ∉ L) is
the integral under the dotted line between z = 0 and
z = 0.12. p(g ∉ L) is the ratio between the size of M
and the size of the full set M ∪ L, and p(z_g < 0.12) is the
integral from z = 0 to z = 0.12 of the normalized p(z)
distribution of M ∪ L.
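The three factors in Equation 5 can be combined as in the sketch below. This is a hedged illustration rather than our actual code: it assumes that each galaxy's p(z) has already been integrated from z = 0 to z = 0.12 to give a per-galaxy low-redshift probability, and the function name and toy values are hypothetical.

```python
def loss_fraction(p_lowz, z_phot, z_phot_max):
    """Equation 5 evaluated on a sample of potential satellites.

    p_lowz     : per-galaxy probability that the true redshift is below 0.12
    z_phot     : per-galaxy best-fit photometric redshift
    z_phot_max : photo-z threshold; galaxies above it form the excluded set M
    """
    cut = [z > z_phot_max for z in z_phot]
    n_all, n_cut = len(p_lowz), sum(cut)
    if n_cut == 0:
        return 0.0  # nothing is excluded, so nothing is lost
    p_lowz_given_cut = sum(p for p, c in zip(p_lowz, cut) if c) / n_cut  # p(z_g < 0.12 | g not in L)
    p_cut = n_cut / n_all                                                # p(g not in L)
    p_lowz_all = sum(p_lowz) / n_all                                     # p(z_g < 0.12)
    return p_lowz_given_cut * p_cut / p_lowz_all

# Four toy galaxies; the last two lie above the photo-z threshold.
eta = loss_fraction([0.9, 0.8, 0.1, 0.2], [0.10, 0.20, 0.30, 0.25], 0.23)
```

Written this way, η reduces to the summed low-redshift probability of the excluded galaxies divided by that of the full sample.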

The galaxy completeness after the photo-z cut, 1 − η,
is plotted against z_phot,max in Figure 6. For our primary
results searching in the main Sloan imaging catalog with
z_phot,max = 0.23, η = 0.16. We thus expect the impact of
this first source of systematic error to be at the ~10
percent level.

Given an estimate for η, we can compute a
straightforward correction for the systematic error from
photo-z losses. We relate p_meas(S), the measured probability
distribution for MC-like satellites around MW-like hosts,
to p_true(N), the actual distribution, applying the overall
loss fraction, η, uniformly as a loss probability for each
satellite and including appropriate combinatorial factors.
For a galaxy with N actual satellites, the probability that
exactly m satellites will be lost is

$$p_{\rm loss}(m|N) = \binom{N}{m}\,\eta^{m}\,(1-\eta)^{N-m}. \qquad (6)$$

Then the measured satellite PDF is related to the true
PDF by

$$p_{\rm meas}(S) = \sum_{m=0}^{\infty} p_{\rm true}(N = S + m)\,p_{\rm loss}(m|N). \qquad (7)$$

This equation corresponds to a formally infinite system
of equations, one for each value of N. Since p_meas(S)
and (presumably) p_true(N) approach zero as N increases,
however, we may solve for p_true(N) by truncating at
some appropriately large values of N and S (chosen to
be 15, well beyond where the average value is zero). This
gives a tractable system of equations, which we then solve
to obtain a result corrected for photo-z losses.
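The truncated system can be solved by back-substitution, because the measured count S only receives contributions from true counts N ≥ S. The sketch below is illustrative, not our production code: the function names and the toy p_true are hypothetical, and the loss probability is assumed to take the standard binomial form.

```python
from math import comb

def p_loss(m, n, eta):
    """Probability that exactly m of n true satellites are lost (binomial form, assumed)."""
    return comb(n, m) * eta ** m * (1.0 - eta) ** (n - m)

def apply_losses(p_true, eta):
    """Forward model: measured PDF implied by a true PDF truncated at len(p_true) - 1."""
    n_max = len(p_true) - 1
    return [sum(p_true[s + m] * p_loss(m, s + m, eta) for m in range(n_max - s + 1))
            for s in range(n_max + 1)]

def correct_losses(p_meas, eta):
    """Invert the truncated (upper-triangular) system by back-substitution."""
    n_max = len(p_meas) - 1
    p_true = [0.0] * (n_max + 1)
    for s in range(n_max, -1, -1):
        # Contribution to bin s from hosts that truly have more than s satellites.
        tail = sum(p_true[s + m] * p_loss(m, s + m, eta) for m in range(1, n_max - s + 1))
        p_true[s] = (p_meas[s] - tail) / p_loss(0, s, eta)
    return p_true

# Round trip on a made-up true PDF truncated at N = 15, with eta = 0.16.
toy_true = [0.80, 0.12, 0.05, 0.02, 0.01] + [0.0] * 11
toy_meas = apply_losses(toy_true, 0.16)
toy_corrected = correct_losses(toy_meas, 0.16)
```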

4.2.2. Large-scale structure effects 

To estimate the impact of correlated structure along
the line of sight, we would like to compute a quantity
analogous to the loss fraction, η (we will call it
the boost fraction, ζ), that quantifies the fraction of our
satellite counts that can be attributed to line-of-sight
structure after we have made an isotropic background
correction. To do this, we make use of the galaxy
autocorrelation function ξ(r), which quantifies the excess
probability above random of finding a galaxy some
distance r from another and which is well measured in the
local universe. Strictly speaking, since we are considering
the correlations between two different galaxy
populations, we should use the cross-correlation function of
these two samples, but given that both MW-sized objects
and LMC-sized objects should be roughly unbiased tracers
of the dark-matter distribution (e.g., Zehavi et al.
2010), their cross-correlation and autocorrelation functions
will be approximately equal.

In particular, we can make use of the projected
correlation function w_p(r_p), which is given by integrating
ξ(r) along the line of sight. This function gives the
excess probability (above random) of finding a galaxy at a
projected distance r_p away from another on the sky. At
r_p < R_sat, the dominant contribution to w_p(r_p) is from
true satellite galaxies, but there will also be some
contribution from unbound galaxies along the line of sight. We
can estimate the size of this contribution by integrating
ξ(r) along the line of sight, excluding a sphere of radius
R_sat around the origin and comparing this to the full
w_p(r_p). Following Davis & Peebles (1983), the modified
projection we want is



$$\tilde{w}_p(r_p) = 2\int_{r_{\min}}^{\infty} r\,dr\,\xi(r)\,(r^2 - r_p^2)^{-1/2}, \qquad (8)$$



where the lower limit of integration is 



1/2 



(9) 



and defines the sphere within which we wish to count
satellites. This can be integrated numerically for a given
choice of ξ(r).

If we let R_sat → 0 and assume a power-law form for
the correlation function, ξ(r) = (r/r_0)^(−γ), we obtain the
well-known analytic formula for w_p(r_p),

$$w_p(r_p) = r_p \left(\frac{r_0}{r_p}\right)^{\gamma} \frac{\Gamma(1/2)\,\Gamma[(\gamma - 1)/2]}{\Gamma(\gamma/2)}. \qquad (10)$$
Then, by integrating both $\tilde{w}_p$ and $w_p$ out to R_sat, we can
compute the probability that a satellite candidate is
correlated with its putative host, beyond what is accounted
for by our isotropic background correction, but is not
actually within R_sat. This probability is the boost fraction

$$\zeta = \frac{\int_0^{R_{\rm sat}} \tilde{w}_p(r_p)\,dr_p}{\int_0^{R_{\rm sat}} w_p(r_p)\,dr_p}. \qquad (11)$$

Assuming γ = 1.1 (approximately the value for galaxies
dimmer than L*; e.g., Zehavi et al. 2010), when R_sat =
150 kpc we obtain ζ = 0.21.

The probability that exactly n correlated galaxies will 
be counted along the line of sight is then 



Pboost(n) = C"(l - 0- 



(12) 



The second factor ensures that the probability distribu- 
tion is normalized (since the sum over n is a geometric 
sequence); it accounts for the probability of having zero 
correlated line-of-sight systems. The systematic correc- 
tion for correlated structure can then be derived as in 
the previous section, by relating the measured PDF to 
the true PDF: 



s 

£ 

n=0 



Ptrue{N = S - n)p boost (n). 



(13) 



We can solve this as before by truncating the formally
infinite system of equations at suitably large S such that
$p_{\rm meas}(S)$ vanishes. In practice, we first compute the correction
for photo-z losses from Equation 7 and then we
compute the boost correction using the results of that
calculation. This ensures that we account correctly for
correlated non-satellite galaxies that were lost to photo-z
failures.
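The truncated system relating the measured and true PDFs is lower triangular, so it can be solved by forward substitution. The sketch below illustrates this for the geometric boost distribution. It is a minimal example under stated assumptions, not the pipeline used for the published numbers: the function names and the toy `p_true` vector are hypothetical, and only the value $\zeta = 0.21$ comes from the text.

```python
def boost_pdf(zeta, n_max):
    """Geometric boost distribution p_boost(n) = zeta**n * (1 - zeta), n = 0..n_max."""
    return [zeta ** n * (1.0 - zeta) for n in range(n_max + 1)]


def convolve_with_boost(p_true, zeta):
    """Forward model: p_meas(S) = sum_{n=0}^{S} p_true(S - n) * p_boost(n)."""
    p_boost = boost_pdf(zeta, len(p_true) - 1)
    return [sum(p_true[s - n] * p_boost[n] for n in range(s + 1))
            for s in range(len(p_true))]


def remove_boost(p_meas, zeta):
    """Invert the truncated triangular system by forward substitution."""
    p_boost = boost_pdf(zeta, len(p_meas) - 1)
    p_true = []
    for s, measured in enumerate(p_meas):
        # Contribution to p_meas(S) from hosts with fewer than S true satellites.
        contaminated = sum(p_true[s - n] * p_boost[n] for n in range(1, s + 1))
        p_true.append((measured - contaminated) / p_boost[0])
    return p_true


if __name__ == "__main__":
    zeta = 0.21  # boost fraction quoted in the text
    toy_true = [0.80, 0.12, 0.05, 0.02, 0.01, 0.0, 0.0]  # hypothetical PDF
    toy_meas = convolve_with_boost(toy_true, zeta)
    print(remove_boost(toy_meas, zeta))
```

Running the correction on a PDF that was generated by the forward model recovers the input, which is a quick way to confirm that the truncation point is large enough.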

Before moving on, we make note of a possible inaccuracy
in the analysis in this section. We have assumed
that $\zeta$ does not depend on the true number of satellites,
N. However, since $\zeta$ depends on the bias of the hosts, this
may not be completely correct. One might imagine that
the satellite population depends, to some extent, on the
formation epoch of the hosts (since hosts forming earlier
have more time to disrupt or merge with their satellites).
Galaxy biasing is also known to depend on formation
epoch (the so-called "assembly bias"), and this effect is
at the ~20% level for halos like the MW (Wechsler et al.
2006). In fact, Busha et al. (2010b) show explicitly that
there is some dependence of the satellite number on environment
in this mass regime. $\zeta$ will depend linearly
on the host bias via the host-satellite cross-correlation
function. However, including this effect would complicate
our analysis substantially: we would no longer be
able to separate Equations 7 and 13, and we would have
to write them as a double sum, yielding a much more
complicated system of equations. Because the effect is of
order 20% on top of a boost fraction that is of similar order,
we treat it as a second-order correction and neglect
it.

5. RESULTS 
5.1. Primary Results 



TABLE 1

Percentage of MW-luminosity host galaxies with N LMC/SMC-luminosity
satellites within a sphere of radius 150 kpc, for N = 0-6

Satellite   Measured %        Systematic Loss   Annulus Systematic
Counts      of MW analogs     Adjustment        Uncertainty a

Zero        83.4 +.../-...    -2.0              -4.2
One         10.8 +.../-...    +0.8              +2.6
Two         3.1 +1.3/-...     +0.4              +1.6
Three       1.4 +0.9/-...     +0.2              +0.1
Four        0.7 +0.6/-...     +0.4              +0.2
Five        0.1 +0.2/-...     +0.2              +0.1
Six         ...               -0.1              +0.1

a This is our estimate for the maximum additional correction that 
might be required to account for having chosen a non-optimal annulus 
for background estimation. 



To compute our main results we use the parameters $R_{\rm iso} = 0.5$ Mpc, $\Delta M_{\rm iso} = 0$ (i.e., only rejecting galaxies as non-isolated if they have a brighter companion), $\Delta M_{\rm sat} = 2$ (searching for satellites 2-4 magnitudes dimmer than the host), $R_{\rm sat} = 150$ kpc, and $z_{\rm phot,max} = 0.23$. The maximum photo-z value is chosen to yield random errors that are greater than or similar to the systematic errors from photo-z losses (see Figure 5), as discussed in Section 4.2.1. We note that our isolation and satellite-search parameters would select the MW-LMC-SMC system, since our nearest bright neighbor, M31, is 0.7 Mpc distant, and the LMC and SMC are both well within 150 kpc of the MW. In what follows, we will vary these parameters to check the robustness of our results; we find the satellite counts to be relatively insensitive to the choice of parameters.

In Figure 7 and Tables 1 and 2 we report the percentage of MW-sized galaxies with N satellites or correlated objects centered on the host. N takes on integer values, and is labeled $N_{\rm cor}$ for the result obtained through isotropic background subtraction and $N_{\rm sat}$ for the result obtained through annular background subtraction.

Our annular background-subtraction technique gives our best estimate for the counts of dwarf satellites within a sphere of radius $R_{\rm sat}$ centered on each MW-sized host. We find the probabilities of $N_{\rm sat} = 0$, 1, and 2 bound MC-like satellites to be (81.4 ± 1.5)%, (11.6 ± 1.8)%, and (3.5 ± 1.4)%, respectively, after adjustment for systematic errors arising from catastrophic photo-z failures (see Section 4.2.1). The measured values and systematic corrections are tabulated in Table 1 and plotted in the left-hand panel of Figure 7 (green curve and data points). Also plotted in that figure are the composite counts PDF p(T) and the background PDF p(B) (blue and red curves, respectively), which are the curves that have been deconvolved to yield the measured satellite counts.

We derived these results using the comparison region immediately outside our search aperture, which we called Annulus I in Section 3.2.2. As we discussed in that section, this may yield a very slight overestimation of the background, according to our tests in simulations. To quantify the potential size of this residual systematic error, we also compute our results using Annulus II (which simulations suggest is likely to yield a very slight underestimate of the background). We take the difference between these two results to be an estimate for the maximum remaining systematic error in our primary results; we report this in the final column of Table 1.

TABLE 2

Percentage of MW-luminosity host galaxies with N LMC/SMC-luminosity
correlated objects within a cylinder of radius 150 kpc, for N = 0-6

Correlated   Measured %        Systematic Loss   Systematic Boost
Objects      of MW analogs     Adjustment        Adjustment

Zero         69.7 +.../-...    -3.8              +16.5
One          21.1 +1.7/-...    +1.0              -11.1
Two          ...               +2.1              -3.4
Three        1.3 +1.0/-...     +0.2              -1.8
Four         0.6 +0.7/-0.5     +0.3              -0.1
Five         0.1 +0.3/-0.1     +0.1              -0.1
Six          0.1 +0.2/-...     -0.0              -0.1

TABLE 3

Satellite statistics of red- and blue-sequence MW-sized galaxies, using
annular background estimation and after systematic adjustment.

The isotropic background correction yields counts of MC-like dwarf galaxies correlated with the host within a cylinder around the host with radius $R_{\rm sat}$ and effective half-length of roughly the correlation length of unbiased mass tracers. For $N_{\rm cor} = 0$, 1, and 2, we find probabilities of (64.6 ± 1.5)%, (22.8 ± 1.8)%, and (9.7 ± 1.5)%, respectively. These numbers have also had the systematic correction for photo-z loss applied; the measured numbers and the corrections are tabulated in Table 2 and plotted in the right-hand panel of Figure 7 (thick orange curve and open data points), along with the composite and background PDFs. As discussed in Section 3.2.1, this result is the most directly comparable to satellite counts measured in redshift space, if interlopers have not been accounted for, so we present it here for comparison to our results in Section 2.4.



In Section 4.2.2 we developed a further systematic correction to allow us to remove the effects of correlated line-of-sight structures from this result. We compute this correction for the results of the isotropic background subtraction, and we give the results in the final column of Table 2. We also plot the corrected probabilities in the right panel of Figure 7 (solid orange points) and compare to the results of our annular background correction (green points). The good agreement between these two approaches gives us confidence that our methods are robust. We note that a similar systematic boost correction could also be usefully applied to any future spectroscopic satellite searches to account for correlated interlopers.

Our results compare favorably with data from recent high-resolution numerical N-body simulations, such as the Millennium-II simulation (Boylan-Kolchin et al. 2010) and the Bolshoi simulation. The latter agreement will be discussed in more detail in a companion paper to this one (Busha et al. 2010b). It is also worth mentioning that none of our measurements of $p(N_{\rm sat})$ is consistent with a Poisson distribution with an expectation value of $\langle N \rangle = 0.3$. A detailed discussion of this can be found in the companion paper (Busha et al. 2010b).
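The Poisson comparison can be illustrated with a few lines of arithmetic. The sketch below evaluates a Poisson distribution with mean 0.3 and compares it with the annular-subtraction probabilities quoted in this section for $N_{\rm sat} = 0$, 1, and 2. Treating the quoted uncertainties as independent 1-sigma errors is an assumption made only for this illustration, and the variable names are hypothetical.

```python
import math


def poisson_pmf(n, mean):
    """Poisson probability of exactly n events for the given mean."""
    return math.exp(-mean) * mean ** n / math.factorial(n)


# Probabilities quoted in the text for N_sat = 0, 1, 2 (as fractions).
measured = [0.814, 0.116, 0.035]
# Quoted uncertainties, treated here as 1-sigma errors (an assumption).
sigma = [0.015, 0.018, 0.014]
mean_nsat = 0.3

deviations = []
for n, (p, err) in enumerate(zip(measured, sigma)):
    expected = poisson_pmf(n, mean_nsat)
    deviations.append((p - expected) / err)
    print(f"N={n}: measured {p:.3f}, Poisson {expected:.3f}, "
          f"difference {deviations[-1]:+.1f} sigma")
```

Under these assumptions the N = 0 bin lies well above the Poisson value and the N = 1 bin well below it, each by several times the quoted uncertainty, while the N = 2 bin is close to the Poisson value.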



5.2. Satellite populations as a function of host-galaxy color

These results suggest that the MW, with two large, close satellites, is not a typical galaxy for its luminosity. Since the MW is a blue, star-forming galaxy, we can take the analysis one step further and investigate whether the number of satellites is a function of galaxy color. This may be quite worthwhile, since the SDSS sample is dominated by red galaxies, and this could complicate the implications of our study for the MW. Galaxy colors in the local universe are well known to be bimodal (Strateva et al. 2001), and we can cleanly divide our sample into red and blue objects by cutting at $u - r = 2.4$.

Number of    Red Galaxies     Blue Galaxies    Average
Satellites   P(N_sat) (%)     P(N_sat) (%)     P(N_sat) (%)

Zero         82.0             81.2             81.5
One          11.6             12.5             11.7
Two          2.6              3.5              3.5
Three        2.1              0.5              1.5
Four         0.8              1.3              1.1
Five         0.3              0.3              0.3
Six          0.0              0.0              0.0

We repeat our analysis for the red and blue samples separately, using annular background estimation, and we find no statistically significant difference between the satellite statistics of the two sets. The results are provided in Table 3, where systematic adjustments for photo-z losses have been applied to the numbers given (the adjustments were not applied in Tables 1 and 2). This result appears to be at odds with work by Lorrimer et al. (1994) and Chen (2008), who found more satellites around early-type galaxies, on average, than around late-types. However, those studies considered a wider range in host luminosity than we have done here, and so it is likely that the early-type samples were skewed toward brighter magnitudes than the late-type sample. The fact that we find no significant difference in our larger sample, which is limited to a narrow range in host luminosity, suggests that the earlier results may have mainly uncovered a trend with host-galaxy luminosity, rather than galaxy type.

It is reasonable to wonder how our results change if we 
divide the satellite population by color, especially since 
the MCs are both blue, star-forming galaxies. However, 
since we do not have very accurate photo-z estimates for 
faint SDSS galaxies, we also lack good k-corrections for
these objects, and so their absolute colors are uncertain. 
In order to produce robust and reliable results on the 
color dependence of the satellite population, more accu- 
rate photo-z estimates would be required. We therefore 
do not attempt to perform this test here. 



5.3. Robustness of the Results 
5.3.1. Varying the selection and search criteria 

In this section we confirm the stability of our main results for the probability of finding $N_{\rm sat}$ MC-like satellites in a sphere of radius $R_{\rm sat}$ around an MW-sized host. We vary several key parameters defined earlier in Sections 2.2 and 2.3: $R_{\rm sat}$ (the satellite search radius), $\Delta M_{\rm sat}$ (the maximum satellite magnitude relative to the host), $R_{\rm iso}$ (the host isolation radius), and $\Delta M_{\rm iso}$ (the host isolation relative magnitude limit). The first two parameters alter our definition of an MC-like satellite, while the latter two change what is considered a suitable MW-like host. We vary each of these parameters over a reasonable range of selection criteria that might be expected to produce an approximate analog of the MW-LMC-SMC system. The results of this investigation are shown in Figure 8, where each parameter is varied in turn, while holding the other parameters fixed at their nominal values.

Fig. 7. Probability that an MW-sized galaxy hosts $N_{\rm sat}$ LMC/SMC-like satellites or $N_{\rm cor}$ line-of-sight correlated structures. Left: Our primary result, $p(N_{\rm sat})$ (solid green points), computed using annular background estimation, along with the various steps involved in deriving this result, described in Section 3. The blue curve shows the composite counts PDF, p(T), and the red curve shows the background counts PDF estimated from annuli around each MW analog. The green curve is the deconvolution of these two PDFs, p(S), and the green data points are the values for $p(N_{\rm sat})$ after correction for photo-z losses. Right: Similar curves are shown, but now for the case of isotropic background estimation. Now the orange curve is the deconvolved p(S) and the open orange points are $p(N_{\rm cor})$ after correction for photo-z losses. Solid orange points show our best estimate for $p(N_{\rm sat})$ in this case, after computing a correction for correlated structure along the line of sight. We find that these results compare favorably to the results from the left panel (green points and curve), which suggests that our results are robust. The host and satellite selection parameters used in this analysis are $R_{\rm iso} = 0.5$ Mpc, $R_{\rm sat} = 150$ kpc, $z_{\rm phot,max} = 0.23$ ($\eta = 0.16$), and $\Delta M_{\rm iso} = 0$.

As would naively be expected, more satellites are detected as we increase the satellite search radius $R_{\rm sat}$. However, the satellite counts are remarkably flat out to $R_{\rm sat} = 200$ kpc. If we very stringently require candidate MCs to lie within $R_{\rm sat} = 100$ kpc of their hosts (as the LMC and SMC do), then slightly less than 3% of MW-sized galaxies host two MC-like satellites. On the other hand, if we expand the search radius to 200 kpc, this fraction becomes about 5%. Even expanding it to 250 kpc (roughly the virial radius of the MW derived in Busha et al. 2010a), the fraction of hosts with two satellites rises only to ~8%. This suggests that our analysis has largely captured the probability of true MC-analog satellites.

To further test whether we have captured the full satellite population in our main analysis, we compute the mean $N_{\rm sat}$ values in each of the radial bins shown in the upper left-hand panel of the figure. Taking the difference between these values, and assuming a spherical search geometry, we can then compute the number density of satellites in bins of radius. Because this measurement is nonnegative by construction, we expect that stochastic noise will cause us to measure a positive value in each bin; however, once we have measured the satellite population as completely as is possible within the uncertainties, the measured number density at all higher radii will be consistent with zero. In performing this exercise, we find that the measured average number density rises sharply below $R_{\rm sat} = 150$ kpc and that it is roughly flat and consistent with zero at all higher values of $R_{\rm sat}$. This confirms that our fiducial value of $R_{\rm sat}$ captures the MC-analog population as well as is possible within the uncertainties in our analysis.
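The shell-density estimate described above amounts to differencing the cumulative mean counts and dividing by the volume of each spherical shell. The sketch below shows the arithmetic. The input radii and mean counts are hypothetical placeholders rather than measured values, the function name is illustrative, and the innermost shell is assumed to start at zero radius.

```python
import math


def shell_number_density(radii_kpc, mean_nsat):
    """Satellite number density in spherical shells (per cubic kpc).

    radii_kpc: increasing search radii R_sat.
    mean_nsat: mean satellite count within each search radius.
    The first shell is assumed to extend from r = 0 to radii_kpc[0].
    """
    densities = []
    inner_radius, inner_count = 0.0, 0.0
    for outer_radius, outer_count in zip(radii_kpc, mean_nsat):
        shell_volume = 4.0 / 3.0 * math.pi * (outer_radius ** 3 - inner_radius ** 3)
        densities.append((outer_count - inner_count) / shell_volume)
        inner_radius, inner_count = outer_radius, outer_count
    return densities


if __name__ == "__main__":
    # Hypothetical inputs chosen only to illustrate the calculation.
    radii = [100.0, 150.0, 200.0, 250.0]
    counts = [0.20, 0.30, 0.31, 0.31]
    for radius, density in zip(radii, shell_number_density(radii, counts)):
        print(f"shell ending at {radius:.0f} kpc: {density:.3e} per kpc^3")
```

Because the shell volume grows rapidly with radius, a fixed increment in the mean count corresponds to a much smaller density at large radius, so a flat cumulative count beyond the fiducial radius maps directly onto a density consistent with zero.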

In addition, a slight upward trend in the N = 2 value
appears at the 1σ level as hosts become increasingly isolated
from larger neighbors (upper-right panel). If this
weak trend is real, it is most easily explained as an ef- 
fect of host formation history. More isolated hosts will 
have formed more recently, on average, so they will have 
had less time to disrupt or accrete their satellites, and 
so their satellite population will be enhanced relative to 
hosts in denser regions. 

Lastly, we note that there is very little trend with the satellite relative-luminosity criterion $\Delta M_{\rm sat}$ (lower left panel of the figure), despite the fact that we are increasing the magnitude range considered by up to a factor of two. Since the overall galaxy luminosity function is not particularly steep over this magnitude range, one might expect the satellite probability to rise substantially when we broaden this search criterion. However, there is no particular reason that the luminosity function of satellites of MW analogs should be the same as the overall luminosity function in this range. Our results suggest in fact that it is not. We may conclude from this result that satellites brighter than the MCs are extremely rare.

No other significant trends are observed under varia- 
tion of host parameters. Specifically, little to no change 
in the results is evident if we reject hosts with compan- 
ions slightly fainter than themselves. This is not par- 
ticularly surprising, since such galaxies constitute only 
around one quarter of the sample in our primary anal- 
ysis. Thus, we find our results to be quite robust to all 
significant and reasonable parameter changes; they are 
not simply an accident of the satellite search criteria we 
have chosen. 

5.3.2. The Stripe 82 co-added catalog 
Fig. 8. Sensitivity of the probability of hosting N = 0, 1, 2, or 3 satellites to changes in various selection parameters. In each panel, one selection parameter is varied and the others are held fixed at the nominal values that were used in Figure 7. Our results for the nominal parameter values are shown as dotted lines. Top Left: Dependence of probabilities on satellite search radius around the host galaxy. Top Right: Dependence of probabilities on variation of the isolation radius around the host galaxy, $R_{\rm iso}$. Bottom Left: The allowed magnitude disparity between host and MC-like satellite, $\Delta M_{\rm sat}$, is varied. Here, we search for satellites 2-4, 1.5-4, 1-4, 0.5-4, and 0.1-4 magnitudes dimmer than the host; the plot is indexed by the changing minimum value. Bottom Right: Results with increasingly stringent host-neighbor relative brightness limit $\Delta M_{\rm iso}$.



We partially repeat the analysis for the deeper co-added data in the SDSS equatorial stripe (Stripe 82). The Stripe 82 catalog is not only deeper than the main SDSS imaging database (magnitude limit $r \approx 23.5$) but has no spatial intersection with the Northern Galactic Cap (NGC), offering a disjoint set of objects with which we can verify the results. Because of the deeper photometric limit, we can consider potential MW-like hosts a bit dimmer, near to the completeness limit of the main SDSS spectroscopic sample, r = 17.60 (whereas we were limited to r = 17, four magnitudes brighter than our photometric limit, in the NGC). Even with this deeper magnitude cut, there are only 1946 MW-sized galaxies in Stripe 82 that have spectra and meet our primary isolation criteria, compared to 22,581 in the NGC. This sample extends to slightly higher redshift: z = 0.15, rather than 0.12.

Since the statistical power of this sample is limited by its small size, we choose to compute only one of the results for comparison to the NGC sample. The simplest result to compute is the PDF of correlated galaxy counts, $p(N_{\rm cor})$, calculated using isotropic background estimation. We have already shown that this result can be systematically corrected to accurately recover $p(N_{\rm sat})$, and there would be no changes to this correction procedure in the Stripe 82 data set, so comparing this one result should be sufficient. The isotropic-background result also has the advantage of being most directly comparable to the spectroscopic analysis we performed in Section 2.4 (as explained in Section 3.2.1).

Fig. 9. Probability of finding $N_{\rm cor}$ correlated objects around a MW analog in a cylinder with radius $R_{\rm sat} = 150$ kpc, computed using three different data sets. Grey: Results using hosts from the Northern Galactic Cap region of the SDSS spectroscopic main catalog and satellites from the photometric main catalog. Blue: Results using hosts from the Stripe 82 region of the spectroscopic catalog and satellites from the photometric co-added data. Orange: Results using hosts and satellites only from the NGC region of the spectroscopic main catalog.

All the methods described earlier apply to this analysis except for the specifics of our choice of $z_{\rm phot,max}$ and computation of the loss fraction $\eta$. A careful photo-z cut is even more important here, as the sky density of photometric objects in Stripe 82 far exceeds that of the main SDSS imaging catalog in the north. Here, we use photometric redshifts computed for the full Stripe 82 co-added catalog (Reis et al., in preparation) using the neural-network approach of Oyaizu et al. (2008). Lacking full p(z) estimates for this sample, we compute $\eta$ from a subset of the photo-z validation set matched to the apparent magnitude and redshift distributions of our MC-like satellites. Objects in the validation set have measured spectroscopic redshifts but were not used to train the photo-z algorithm; they are used to test the accuracy of the photo-z estimates. We can make histograms of this data set to obtain p(z) distributions for different subsets and then perform an analysis analogous to Section 4.2 to obtain the fractional photo-z loss, $\eta$. We find that, for the Stripe 82 photo-z values, a cut of $z_{\rm phot,max} = 0.21$ corresponds to $\eta = 0.15$, which is acceptably small, so we use this maximum photo-z cut in our analysis.
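A validation set with both spectroscopic and photometric redshifts allows a simple counting estimate of a loss fraction. The sketch below is a simplified stand-in for the histogram-based procedure referred to above, not a reproduction of it: it assumes that the loss fraction can be approximated as the fraction of validation objects whose spectroscopic redshift lies within the host redshift range but whose photo-z exceeds the cut. The function name, the `z_true_max` parameter, and the toy redshift lists are all hypothetical.

```python
def photoz_loss_fraction(spec_z, photo_z, z_true_max, z_phot_max):
    """Fraction of objects with spec_z <= z_true_max rejected by the photo-z cut.

    spec_z, photo_z: matched lists of spectroscopic and photometric redshifts.
    z_true_max: largest true redshift at which a satellite could reside.
    z_phot_max: photo-z cut; objects with photo_z > z_phot_max are rejected.
    """
    in_range = [zp for zs, zp in zip(spec_z, photo_z) if zs <= z_true_max]
    if not in_range:
        raise ValueError("no validation objects within the true-redshift range")
    lost = sum(1 for zp in in_range if zp > z_phot_max)
    return lost / len(in_range)


if __name__ == "__main__":
    # Toy validation set, invented purely for illustration.
    spec = [0.05, 0.08, 0.10, 0.12, 0.14, 0.30]
    phot = [0.07, 0.25, 0.12, 0.18, 0.20, 0.28]
    print(photoz_loss_fraction(spec, phot, z_true_max=0.15, z_phot_max=0.21))
```

In practice the validation subset would first be matched to the apparent-magnitude and redshift distributions of the satellite candidates, as described in the text, before any such fraction is computed.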

The $p(N_{\rm cor})$ results obtained from Stripe 82 are in good agreement with those obtained using the photometric catalog in the NGC in Section 5.1 and with the results computed using only spectroscopic information in Section 2.4. A comparison of the three $p(N_{\rm cor})$ measurements is shown in Figure 9. Since disjoint data sets yield statistically consistent results despite covering disparate ranges in redshift and apparent magnitude, and despite having photo-z-induced systematic errors computed with different algorithms, we can be confident that our results do not depend strongly on these details.

6. CONCLUSIONS 

We have investigated the occurrence of dwarf satel- 
lites with luminosities similar to the Magellanic Clouds 
around host galaxies with environment and luminosity 
similar to the MW. Our analysis uses spectroscopic data 
from SDSS to identify isolated MW-like galaxies and 
then searches the SDSS photometric data for potential 
satellites between two and four magnitudes fainter than 
these hosts. The primary result, summarized in Table 1,
is the probability distribution of hosting $N_{\rm sat}$ satellites
similar to the LMC and SMC. We find that, of our 22,581 
MW-luminosity host galaxies, 81% have zero satellites 
as bright as the Magellanic Clouds, 11% have one such 
satellite, and only 3.5% host two such satellites. 

The main source of uncertainty in our analysis is the presence of projected foreground and background objects along the line of sight to each MW analog. We correct for these in two stages. First, we reject most background objects with a rough photometric-redshift cut, and then we statistically correct for the remaining background objects by comparing the counts around MW analogs to counts in areas of the sky that do not contain MW-like objects. Because the background-noise level (rather than the sample size) is the dominant source of statistical uncertainty in our main analysis, the best potential for improving upon the precision of our results would come from an improvement in the photometric redshift estimates of faint objects, which would allow a more stringent initial background rejection.

The specific manner in which the background- 
estimation fields are selected has important implications 
for our final results. Fields chosen at random positions 
on the sky do not account for the clustering of galax- 
ies, which will enhance line-of-sight projections around 
our chosen hosts, although an approximate correction 
can be calculated. We emphasize that such a correction 
is needed even when perfect spectroscopic information 
is available. Alternatively, the background fields may be 
chosen as annuli around each host, outside of the satellite 
search radius. This accounts for both random and corre- 
lated line-of-sight projections, and it is the technique we 
use for our main results. However, the optimum radius 
for these annuli is not entirely obvious, and this intro- 
duces a small additional systematic uncertainty into our 
results. Allowing for this systematic error, it is possible
that the percentage of MW analogs that host two MC-analog
satellites could be as high as ~5% (see Table 1).

Nevertheless, the clear qualitative conclusion is that the presence of the LMC and SMC makes the MW quite unusual among the population of galaxies with similar luminosity. This is broadly in agreement with earlier observational studies that found < 1 satellite on average around typical bright galaxies (Zaritsky et al. 1993; Lorrimer et al. 1994; Chen et al. 2006; James & Ivory 2010). In fact, when we compute the average number of satellites from our measured $p(N_{\rm sat})$ distribution in Table 1, we find $\langle N_{\rm sat} \rangle = 0.3$, which is lower than the mean values reported in those studies. However, because previous authors considered a much wider range of host and satellite luminosities than we consider here, we do not expect more than qualitative agreement in any case. Similarly, we do not find a statistically significant difference in $p(N_{\rm sat})$ for red versus blue galaxies, and this appears to be in conflict with the results of Lorrimer et al. (1994) and Chen (2008), who find that early-type galaxies host significantly more satellites than late-types do. However, the broad range of host luminosities considered in those earlier studies, combined with the different luminosity distributions of early and late types, means that the underlying trend they found could be with luminosity, rather than color.

Our results are useful for understanding the larger cosmological context of the Milky Way Galaxy. There is a striking agreement between our results and the predictions of recent high-resolution cosmological N-body simulations. For example, Boylan-Kolchin et al. (2010) found that MW-sized dark-matter halos hosted two MC-sized subhalos only < 10% of the time in the Millennium II simulation (Boylan-Kolchin et al. 2009). Similar results are obtained with the Bolshoi simulation; the consistency between our results and the Bolshoi predictions will be discussed in detail by Busha et al. (2010b). This agreement constitutes an important confirmation of the cold dark matter paradigm for galaxy formation.
Our result also indicates that the MW is somewhat unusual among galaxies of similar luminosity, at least in terms of its satellite population. A major philosophical underpinning of our cosmology is the Copernican principle, which holds that we do not observe the Universe from any privileged position, except insofar as such a position is required for our existence (e.g., our very atypical position on a rocky planet with an atmosphere). Since there is no obvious anthropic requirement on the number of bright satellite companions to the MW, it would not be unreasonable to wonder whether our results present a challenge to the Copernican principle.

Applied to the expected properties of our home galaxy, 
a reasonable statement of the principle is that the Milky 
Way should be consistent with a galaxy chosen at ran- 
dom from the stellar-mass- weighted galaxy population at 
large. It is important to note that this does not necessar- 
ily mean that the MW should be "typical" in all possible 
respects. In particular, it is not especially surprising
that a randomly selected galaxy will be a ~2σ outlier
by at least one measure. Moreover, there is now reasonably
strong evidence that the LMC and SMC were
recently accreted by the MW and are on their first pass
through the halo (e.g., Besla et al. 2007; Busha et al.
2010a). If this is true, then the presence of the Magellanic
Clouds may be a transient event in the formation
history of the MW, implying that the MW is not fun- 
damentally unusual in any way. Thus, we may conclude 
that the unusually large population of bright satellites 
around the MW can likely be ascribed to happenstance 
and presents no special challenge to our basic cosmolog- 
ical paradigm. 